Abstract

This paper estimates the relationship between initial 
village inequality and subsequent household income 
growth for a large sample of households in rural China. 
Using a rich longitudinal survey spanning the years 
1987-2002, and controlling for an array of household 
and village characteristics, the paper finds that households 
located in higher inequality villages experienced 
significantly lower income growth through the 1990s. 
However, local inequality's predictive power and effects 
are significantly diminished by the end of the sample. 
The paper exploits several advantages of the household-level
data to explore hypotheses that shed light on the
channels by which inequality affects growth. Biases due
to aggregation and heterogeneity of returns to own-resources,
previously suggested as candidate explanations
for the relationship, are both ruled out. Instead, the
evidence points to unobserved village institutions at 
the time of economic reforms that were associated with 
household access to higher income activities as the source 
of the link between inequality and growth. The empirical 
analysis addresses a number of pertinent econometric 
issues including measurement error and attrition, but 
underscores others that are likely to be intractable for all 
investigations of the inequality-growth relationship. 



This paper — a product of the Human Development and Public Services Team, Development Research Group — is part 
of a larger effort in the department to understand the relationship between inequality and economic performance. Policy 
Research Working Papers are also posted on the Web at http://econ.worldbank.org. The author may be contacted at 
jgiles@worldbank.org. 



The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the 
names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those 
of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent. 



Produced by the Research Support Team 



Did Higher Inequality Impede Growth in Rural China?†



Dwayne Benjamin 
University of Toronto 

Loren Brandt 
University of Toronto 

John Giles 
The World Bank 



JEL Classification: O12, O15, P20
Keywords: Inequality, growth, rural China 



†This is a completely rewritten version of the previous working papers, "Inequality and Growth in Rural
China: Does Higher Inequality Impede Growth," (University of Toronto Working Paper #237, June 2006; 
and Institute for the Study of Labor (IZA) Discussion Paper 2344, September 2006). We thank Gustavo 
Bobonis for helpful comments, and the editor and referees for constructive suggestions for this revision. 
Benjamin and Brandt gratefully acknowledge the Social Sciences and Humanities Research Council of 
Canada for financial support, and Giles acknowledges support from the National Science Foundation 
(SES-0214702). 



1.0 Introduction 



For researchers estimating the effect of inequality on growth, China would seem a promising 
"laboratory." Since the start of economic reform in the early 1980s, it experienced plenty of both: Per 
capita income has grown nearly 8 percent annually, while the Gini coefficient rose from 0.28 to 0.39 
(Ravallion and Chen, 2007). There was also significant within-China variation in growth and inequality at 
the local level (Benjamin, Brandt, and Giles 2005). Experience with cross-country aggregate data, 
however, demonstrates that estimating a robust correlation, let alone a causal relationship between 
inequality and growth, faces major empirical challenges, some of which stem from less than ideal data. 1 
Better data alone, however, cannot solve the causality problem. Kuznets (1955) suggests the opposite 
chain of causality, from growth to changes in the income distribution; and almost certainly, unobserved 
heterogeneity is a potential factor, with inequality reflecting other factors that drive growth. Finally, even 
if we can estimate a "reduced form" effect of inequality on growth, it may still be impossible to identify 
the channels through which it matters. 

In this paper we use the post-reform experience of rural China to determine whether local 
inequality impeded the growth of household incomes. We are able to address some of the methodological 
problems that plague cross-country data. By using a repeated, consistently applied household survey we 
avoid some of the measurement problems endemic in the cross-country setting. At least compared to 
international variation, the relative similarity of local institutions across villages in China also permits a 
cleaner isolation of the impact of inequality from other unobserved factors. At the same time, there are 
sufficient spatial differences in institutions and inequality that exploring the experiences of rural 
households scattered across villages can inform us about the potential channels by which inequality 
affects income growth. In particular, we can distinguish between two broad classes of explanations that 
have been offered as to why inequality affects growth: imperfect factor markets, including credit, or 
growth-inhibiting institutions. 

The foundation of this paper is a rich panel that tracks household incomes from early in the 
reform period (1987) to nearly the present (2002). The data allow us to link the detailed trajectories of 
household income to initial village and household conditions. Our question is simple: Controlling for a 
rich set of covariates, did higher village inequality dampen household income growth? There are several 



1 The potential problems are numerous: cross-country heterogeneity of data and measurement standards, problems of 
aggregation, measurement error of key variables, short time series of inequality and growth, and
less-than-comparable estimates of inequality.
advantages to using household-level data. 2 Most importantly, we can control for a host of household-level
variables, notably flexible functions of initial household income, that may be confounded with local
inequality. This helps rule out some potential explanations for the inequality-growth relationship, in 
particular those that rely on the aggregation of non-linear effects of own-income. With household-level 
data we are also able to explore "within-village" heterogeneity of the impact of inequality on growth: Are 
the poor hurt more than the rich? Are households with better-educated members immune to the impact?
We are also able to broaden the set of outcomes from "income growth" to other economic variables, like 
the composition of income, e.g., concentration in agriculture, or participation in off-farm employment, 
that inform the question of how inequality may affect growth. 

Repeated observations at the village-level allow us to evaluate the stability and consistency of the 
"treatment" of higher inequality: We examine whether all variation in inequality is the same. First, we 
explore the impact of cross-sectional differences of initial inequality across villages. We then determine 
whether rising inequality within villages, or changing inequality across villages, affects household growth 
in the same way. These multiple sources of "treatment" provide identifying information on the nature of 
any causal relationship between inequality and growth. The village-dimension of the panel also allows us 
to address whether unobserved heterogeneity confounds the impact of initial inequality. 

While we are able to provide robust estimates of the correlation between village inequality at the 
outset of reforms and subsequent growth, there remain disheartening limits as to what we can learn from 
even these data. Paramount among these limits, the endogeneity problem is intractable: Solving it requires
finding instruments that predict initial income inequality, but are otherwise excludable from a growth 
equation, and in particular are uncorrelated with any institutions that may be related to subsequent 
growth. The next best thing is to trace the correlation of income inequality through various observable 
institutional channels that affect growth. While we previously attempted this (Benjamin, Brandt, and 
Giles, 2006), we were unable to find any robust linkages between inequality and measured institutions at 
the outset of reforms. Again, this problem is thorny to solve, as initial inequality in a village in 1987 will 
reflect not just the immediate institutional structure of the reform period, but the entire history of the 
village throughout the occasionally tumultuous post-1949 period (e.g., land reform, collectivization, the
Great Leap Forward, the Cultural Revolution), as well as its pre-1949 socio-economic structures.
Moreover, "village" inequality will reflect "local" conditions beyond the village, at the township and even 
county-level, making it difficult to pin down the precise institutional mechanism. 



2 Ferreira (2010), among others, has highlighted the importance of using micro-level data to shed light on those 
mechanisms driving the growth-inequality relationship that cannot be addressed by the cross-country 
macroeconomics literature. 
Despite these limitations we are able to establish robust patterns in the data that are suggestive of
the channels by which inequality operates, and those by which it does not. First, we find that initial inequality has
a robust negative effect on household income growth that is impossible to dismiss. Second, the effect of 
inequality fades over time: As villages became more integrated with the wider economy, the influences of 
initial conditions on trajectories were "swamped" by rapidly expanding external opportunities, and 
possibly local institutional change. Third, we find that only inequality from the very beginning (1987) 
matters, at least for the 15-year period that we observe. There is no evidence that generally rising
inequality, or changes of inequality across villages, have an impact on household income growth. Fourth, 
we find that education and access to off-farm opportunities play a critical role in this relationship: Better- 
educated individuals are less affected by the adverse impact of inequality, and households in more equal 
villages are better able to move into off-farm wage employment. Fifth, setting aside its link with 
education, we find that high inequality is an "equal opportunity" growth inhibitor: rich and poor alike in 
high inequality villages suffer a growth penalty. 

Overall, we interpret our results as suggesting that the effect of inequality reflects something 
fundamental about the economic and institutional characteristics of villages at the outset of reforms that 
shaped economic opportunity for all households. Credit market stories that we expect to be reflected in 
differences across household income strata, or in the effect of changes in inequality on growth over time, 
seem much less likely. At least in the Chinese context, institution-based hypotheses linking inequality to 
growth are most relevant. 

We begin with a brief review of the reasons why inequality is believed to affect growth, and why 
these factors are relevant in rural China. Next, we provide a formal overview of our empirical framework, 
highlighting two main issues. First, we discuss the identification problem, underscoring the difficulty of 
finding naturally occurring variation that could ever substitute for a proper experiment. There are other 
empirical problems, however, that we can address, including panel-attrition, aggregation, and 
measurement error. Second, we provide a detailed explanation of the links between our household-level 
specification and the aggregate-level regression commonly employed in the literature. After describing 
the data, we present our core empirical results: estimates of the effect of village inequality on household 
income growth, and how this evolves over time. As part of this exercise, we line up the household and 
village-level evidence. We then explore dimensions of the heterogeneity in the response of household 
growth to inequality, focusing on education, household age, and initial household income. The third set of 
results concerns the impact of inequality on the evolution of village economic structure, and household 
participation in agriculture, wage labor, and family businesses. Our last set of results addresses issues of 
dynamics, where we exploit the co-evolution of inequality and growth at the village-level to estimate a 
series of panel-data specifications that underscore the difficulty of drawing strong conclusions about the 
general relationship between inequality and growth from these, and almost certainly other, data sets. The
final section draws together our conclusions. 

2.0 Why Might Village Inequality Affect Growth? 

There are three conventional classes of explanation: imperfect credit markets, imperfect factor 
markets, and political economy. 3 In the first class of explanations, credit market constraints tie the ability 
of households to exploit opportunities for growth to their own resources. As the poorest of households 
have the fewest resources, holding average village incomes fixed, unequal villages have more resource- 
constrained households. If credit markets are fully developed, the relationship breaks down, as the 
distribution of own resources no longer determines the distribution of household growth rates. A second 
possible channel is through factor markets. Higher income inequality (or inequality of land, human, or 
physical capital) may be associated with imperfect competition or other impediments to factor market 
development that limit opportunities for trade, especially for the poor. 4 

The most common channel in the cross-country literature, however, operates
through local political economy: unequal communities make different collective choices that affect the
growth potential of households. For example, high-inequality communities may adopt more progressive
tax structures, as low-income households pressure for redistribution in ways that inhibit growth. This is 
the conventional taxation-based story offered by Alesina and Rodrik (1994), Benabou (1996), and 
Persson and Tabellini (1994). Alternatively, the greater homogeneity (equality) of households may 
facilitate consensus for more efficient tax systems, and higher investment in public goods and services. 5 
Localities also play an important role in targeting assistance to the poor, and international evidence 
suggests that there may be greater leakage, and thus poorer targeting in more unequal communities. 6 



3 See Aghion, Caroli, and Garcia-Penalosa (1999), Lloyd-Ellis (2003), and Perotti (1996) for excellent
summaries of the cross-country inequality and growth literature, with detailed discussions of the theoretical linkages 
between inequality and growth. 

4 Notable examples include: Galor and Zeira (1993), Besley and Burgess (2000), Banerjee, et al (2001), Banerjee, 
Gertler, and Ghatak (2002), Galor and Moav (2004), Banerjee and Iyer (2005), and Besley et al (2010). 

5 Several papers suggest that collective action and provision of public goods may be complicated by high levels of 
inequality within communities. See for example, Alesina and La Ferrara (2000), Bardhan, Ghatak and Karaivanov 
(2007), Dayton-Johnson (2000), Araujo, et al (2008), and Khwaja (2009). With respect to public finance, Sokoloff
and Zolt (2005) find that high inequality is correlated with more regressive taxes, and less funding of local public 
investments and services. Glaeser (2006) reviews evidence suggesting that unequal societies are less likely to have 
governments that respect property rights. Acemoglu, et al (2008) show that in the case of Colombia, it was political,
not economic (land) inequality that adversely affected long run outcomes, further reinforcing the importance of this 
class of explanations. 

6 See, for example, Baird et al (2009), Bardhan et al (2008), Galasso and Ravallion (2005), Shankar et al (2010). 
There may also be strong links between the distribution of income, levels of education, and the provision
of public education. 7 

To what extent can we expect any of these factors to be important at the village, or local, level in 
rural China? In principle, all three could have been important. At the outset of reforms, formal sources of 
credit were limited in Chinese villages. Factor markets were also poorly developed. Land for farming,
for example, was allocated administratively, with limited opportunities for either land rental or the hiring 
of labor among households (Benjamin and Brandt, 2002). Migration to the cities, and even other villages, 
was restricted by the household registration, or hukou, system. Last, from the perspective of political 
economy explanations, the village was administratively important. 8 Over the period we examine, village 
governments controlled policy levers that could affect household incomes: They oversaw the allocation of 
land use rights for cultivated land to households as part of the Household Responsibility System (HRS), 
and exercised control over the allocation and management of other collective assets such as forestry and 
village-run enterprises. 9 They also had taxing authority, and until the 2002 Tax-for-Fee Reform, were the 
most important provider of local public goods, including primary education, agricultural infrastructure 
and health care. 10 Finally, village leaders played an important role in targeting poor households for 
assistance. 11 With individual and household geographic mobility severely limited, household income 
opportunities were heavily shaped by local policy. And in this context, inequality at the village level may 
have had an effect on the evolution of household incomes through village policy. Inequality at the village 
level was also likely correlated with governance structures at the township and county level that mattered 
more broadly for economic policy. 12 



7 See, for example, Benabou (1996) or Lloyd-Ellis (2000). This channel may be especially important if there are 
externalities associated with the distribution of education in the economy (e.g., Acemoglu (1996)). Note that, as in 
Galor, Moav, and Vollrath (2009), imperfect credit markets and political economy channels may interact to give rise 
to underinvestment in human capital. 

8 Villages are at the lowest rung of the rural administrative hierarchy. Above villages, township and county 
governments have authority over some fees and taxes, and above the county lies the provincial government. The 
Village Organic Law of 1988 formally recognized village "self-government." 

9 Early in the reform process it was likely that inequality was heavily influenced by how collectivization proceeded. 
The allocation of use-rights of land to households was complete by 1983. By all indications, the distribution of 
cultivated land was fairly egalitarian. Reports suggest, however, that this was much less the case with respect to the 
allocation and sale of other collective assets. 

10 See Zhang, Yan, Brandt, and Rozelle (2005), and Fan, Zhang, and Zhang (2004). Jalan and Ravallion (2002) 
show that local levels of income, and associated public investments (e.g. roads and health care) are positively related 
to household consumption growth, and help explain the existence of "geographic poverty traps," whereby 
households in poorer areas of rural China experience lower growth than those in richer areas. 

11 Over the period covered by this panel, this has included assistance through wu baohu programs for those unable to 
work and employment in food for work programs (Park, Wang, and Wu, 2002). 

12 Those localities in which local cadre were able to capture the rents associated with de-collectivization, for 
example, were often those in which off-farm activity was effectively discouraged through excessive taxation by 
village, township and county governments (Oi, 1989). 
Thus, a case can be made that differences in local economic conditions and village institutions
paralleled some of those across countries, including along dimensions believed to link inequality to 
growth. Over time, however, Chinese villages have become less isolated, and access to new markets and 
opportunities, e.g. through migration, has expanded. Factors that were key determinants of income and 
institutions in the 1980s may be less important now. Governance reforms such as those associated with 
the implementation of the Village Organic Law may have also narrowed some institutional differences 
across villages (Martinez-Bravo et al, 2010). These dynamics themselves may be informative about the 
processes linking inequality and growth. 

3.0 Empirical Framework 

3.1 Inequality and Growth at the Household-Level

Our core analysis is based on the household-level specification: 

$$g_{ivT} = \ln y_{ivT} - \ln y_{iv,t-1} = \alpha_0 + \alpha_1 \ln y_{iv,t-1} + \alpha_2 \widetilde{\ln y}_{v,t-1(i)} + \alpha_3 \widetilde{IQ}_{v,t-1(i)} + \gamma' X_{iv,t-1} + \beta' X_{v,t-1} + u_{ivT} \qquad (1)$$

where $g_{ivT}$ is the (average) growth rate of per capita income for household i in village v between the
initial period, t - 1, and the terminal period, T. This is a structural model relating household growth to
own-household initial resources, $\ln y_{iv,t-1}$, and the distribution of resources across other households in the
village. We summarize two dimensions of this distribution by the mean log incomes of other households
besides household i, $\widetilde{\ln y}_{v,t-1(i)}$, and the level of inequality (i.e., the Gini coefficient) among other
households, $\widetilde{IQ}_{v,t-1(i)}$. For notational convenience we use a "tilde" to denote a statistic calculated over all
village households, excluding household i. We also include controls for both household and village-level
covariates from the initial period, $X_{iv,t-1}$ and $X_{v,t-1}$.

Equation (1) has several inherent advantages over previous specifications. Most importantly, 
compared to typical specifications based on aggregate data, we are better able to distinguish among the 
competing explanations for the growth-inequality relationship. The inclusion of household-level 
covariates, notably flexible controls for initial household income, i.e., polynomials of $\ln y_{iv,t-1}$, helps to
minimize the influence of potentially omitted non-linearities of own-income, and reduces the possibility
that inequality reflects the spurious effects of aggregation. Use of the "leave-one-out," or jackknifed 
inequality index also better highlights the nature of the "treatment" we wish to isolate: the purely external 
effect of inequality. 
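
As a rough illustration of the "leave-one-out" construction described above, the following sketch computes, for each household i, the mean log income and the Gini coefficient over the other households in its village. It is not the estimation code behind the paper's results: the function names (`gini`, `leave_one_out_stats`) and the income figures are hypothetical, chosen only to show the mechanics.

```python
import math

def gini(incomes):
    """Gini coefficient via the sorted-rank formula (incomes must be positive)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    # G = 2 * sum_k(k * x_k) / (n * sum(x)) - (n + 1) / n, with ranks k = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def leave_one_out_stats(village_incomes):
    """For each household i, return (mean log income, Gini) of the OTHER households."""
    stats = []
    for i in range(len(village_incomes)):
        others = village_incomes[:i] + village_incomes[i + 1:]
        mean_log = sum(math.log(y) for y in others) / len(others)
        stats.append((mean_log, gini(others)))
    return stats

# Hypothetical per capita incomes for one five-household village.
village = [400.0, 500.0, 650.0, 900.0, 3000.0]
loo = leave_one_out_stats(village)
for i, (mean_log, g) in enumerate(loo):
    print(f"household {i}: mean log income of others = {mean_log:.3f}, Gini of others = {g:.3f}")
```

In this made-up village the richest household faces a much lower jackknifed Gini than the poorest one, because its own income is the main source of measured dispersion. That is the mechanical link between own income and village statistics that the jackknife is meant to break.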
Imagine a Chinese "Robin Hood" stealing from a rich family, and giving to a poor one, while
leaving household i untouched. 13 This reduces overall inequality, without changing average village 
income, or the income of household i. Why might this redistribution affect the growth trajectory of 
household i? If imperfect credit markets are the only source of the inequality-growth relationship, then in
the household-level specification $\widetilde{IQ}_{v,t-1(i)}$ should not be significant because we are controlling for
household own-resources. Under the credit-market explanation, the distribution of household income
among one's neighbors has no independent effect on a household's own-growth. Indeed, the credit- 
market explanation is only relevant for the aggregate (village-level) specification. As noted by Deaton 
(2003) in the context of inequality and health, and Ravallion (1998) for inequality and growth at the 
county-level in China, some of the key reasons why inequality might be correlated with average outcomes 
are pure artifacts of aggregation: inequality is a proxy for heterogeneity of household resources, or returns 
to resources, that are captured at the aggregate-level. Estimation at the household-level therefore allows
us directly to determine whether village inequality has an external effect on household growth. If it does, 
this provides strong evidence that factors besides imperfect credit markets are the source of the 
relationship. If we also control for own household endowments of land, human capital and labor, we 
reduce the chance that imperfect factor markets drive the inequality-growth relationship. A significant 
effect of $\widetilde{IQ}_{v,t-1(i)}$ will therefore point towards the political economy, or institution-based class of
explanations. 
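
The aggregation argument can be made concrete with a small numerical sketch. Assume, purely for illustration, that household growth depends only on own log income, with no external effect of inequality at all; the coefficients `A` and `B` and the incomes below are invented. Averaging such a rule over a village reproduces an apparent "inequality effect" once the village regression controls for the log of mean income, because the mean of log income equals the log of mean income minus the mean log deviation.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def mld(incomes):
    """Mean log deviation: log of mean income minus mean of log income."""
    return math.log(mean(incomes)) - mean([math.log(y) for y in incomes])

# Hypothetical household growth rule with NO external effect of inequality:
# growth depends only on own log income (A and B are made-up values).
A, B = 0.50, -0.06

def household_growth(y):
    return A + B * math.log(y)

village = [400.0, 500.0, 650.0, 900.0, 3000.0]  # hypothetical incomes

# Village-average growth obtained by aggregating the household rule.
village_growth = mean([household_growth(y) for y in village])

# Same quantity written in "aggregate regression" form:
# mean(log y) = log(mean y) - MLD, so inequality enters with coefficient -B.
implied = A + B * math.log(mean(village)) - B * mld(village)

print(f"aggregated household growth: {village_growth:.6f}")
print(f"A + B*log(mean y) - B*MLD:   {implied:.6f}")
```

The two numbers coincide, so a village-level regression on the log of mean income and the MLD would attribute a coefficient of -B to inequality even though no household's growth responds to its neighbors' incomes. The sign and size of this artifact depend entirely on the assumed own-income rule.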

Another benefit of the household-level specification is that it enables us to explore heterogeneity 
in the response of growth to income inequality. This exercise may inform us as to the mechanism by 
which inequality affects household income growth. For example, if inequality hurts the poor more than 
the rich, one interpretation is that own-resources mitigate the adverse impact of inequality, consistent with 
a model of imperfect credit markets. Of course, political economy models can also generate the prediction 
that the rich are less harmed by institutional failure than the poor, especially if the institutional failure is 
self-serving. On the other hand, a general failure to provide growth-enhancing public goods, or the 
imposition of taxes that discourages a shift to non-agricultural pursuits, might affect all village residents 
similarly. If inequality affects everyone similarly, the public-good oriented explanations seem more 
plausible than the credit-market models. 



13 One candidate for "Chinese Robin Hood" is Song Jiang (宋江) and the 108 bandits from Mount Liang (梁山) who
feature in the Chinese literary classic The Water Margin (水浒传) (Shi and Luo, 1365c). The Water Margin is also
known under the alternative English titles All Men are Brothers (Buck, 1933), Outlaws of the Marsh
(Shapiro, 1981), and The 108 Heroes.
3.2 Econometric Issues: Endogeneity

In order to make Equation (1) operational, we provide more details of the estimation, and 
evaluate the statistical assumptions required to draw causal inferences concerning the impact of $\widetilde{IQ}_{v,t-1(i)}$
on growth. Note first one seemingly minor departure in (1) from the usual aggregate-level specification:
the village mean income is the average of log household per capita incomes, not the log of average
household per capita incomes. This distinction becomes important in aggregation, and in linking the
household parameter $\alpha_3$ to the village-level coefficient on inequality. We also need to specify an index of
income inequality, $IQ_{v,t-1}$. Typically, researchers employ the Gini coefficient, and for comparability we
also report results with the Gini. However, our main index of inequality is the "Mean Log Deviation,"
which is defined as $MLD_{v,t-1} = \ln \bar{y}_{v,t-1} - \overline{\ln y}_{v,t-1}$, the difference between the log of mean income and the
mean of log income. This is the same measure as used by Ravallion (1998) in his exploration of the
consequences of aggregation in estimating the growth-inequality relationship. 14 We use this measure of 
inequality because it is highly convenient for aggregating from the household to the village-level results, 
and for addressing other statistical issues. We note, however, that the empirical results are not sensitive to 
the choice of inequality measure. 
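
To make the measure concrete, the sketch below computes the Mean Log Deviation over the households other than i, and applies it to a hypothetical "Robin Hood" transfer of the kind discussed in Section 3.1. The incomes and the helper name `jackknifed_mld` are assumptions for illustration only. The transfer leaves household i's income and the mean income of its neighbors unchanged, yet lowers the inequality that household i faces.

```python
import math

def mld(incomes):
    """Mean log deviation: log of mean income minus mean of log income."""
    mean_income = sum(incomes) / len(incomes)
    mean_log = sum(math.log(y) for y in incomes) / len(incomes)
    return math.log(mean_income) - mean_log

def jackknifed_mld(incomes, i):
    """MLD computed over all households in the village except household i."""
    others = incomes[:i] + incomes[i + 1:]
    return mld(others)

# Hypothetical village before and after a transfer of 300 from the richest
# household (index 3) to the poorest (index 0); household 1 is untouched.
before = [300.0, 600.0, 800.0, 2300.0]
after = [600.0, 600.0, 800.0, 2000.0]
i = 1

mld_before = jackknifed_mld(before, i)
mld_after = jackknifed_mld(after, i)
print(f"jackknifed MLD faced by household {i}: {mld_before:.4f} -> {mld_after:.4f}")
```

Because the MLD is a difference of two means, it aggregates cleanly from households to villages, which is the convenience noted above.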

There are at least three classes of econometric issues that need to be addressed. The first two are a 
consequence of less than perfect data, and the third of the non-randomization of inequality: 

Measurement Error. There are several ways for measurement error, especially of household 
income, to generate spurious links between initial inequality and subsequent growth. For example, we 
might poorly control for own-initial income. Village summary statistics, like inequality, which are 
correlated with own-household income, may then pick-up some of the effect of own-income on growth. 
To some extent, this is mitigated by our use of the jackknifed $\widetilde{IQ}_{v,t-1(i)}$ that excludes own-household
income. Indeed, use of the jackknifed $\widetilde{IQ}_{v,t-1(i)}$ breaks potential mechanical and small-sample bias links
between the village-level statistics and household i outcomes, and helps address some of the measurement
error that arises from using group means as regressors. In Appendix 1 we show that Equation (1) is part of 
the reduced-form for Deaton's suggested estimator for correcting measurement error with group means. 
Furthermore, we include a rich set of household-level covariates, such as household endowments of land,
labor, and human capital, that should reduce the extent to which the error term contains mis-measured
household-level "growth potential" correlated with $\widetilde{IQ}_{v,t-1(i)}$. More conventionally, however,



14 Our specification departs from Ravallion (1998) in two key ways. First, we use the "leave-one-out" version of the
inequality index, $\widetilde{MLD}_{v,t-1(i)} = \ln \tilde{\bar{y}}_{v,t-1(i)} - \widetilde{\ln y}_{v,t-1(i)}$. Second, we use the mean of log incomes, not the log of mean
incomes, as our control for the "level" of village income (our measure of this is also jackknifed).
mis-measurement of household income may simply contaminate our estimate of initial inequality $\widetilde{IQ}_{v,t-1(i)}$. For

example, "outlier" households may generate a high-level of apparent inequality combined with high 
initial income. With mean-reversion of household incomes, such villages will appear to have lower 
growth rates, even if there is no link between inequality and growth. Alternatively, the greater degree of 
noise in $\widetilde{IQ}_{v,t-1(i)}$ may result in conventional attenuation bias. In principle, this type of measurement error
is relatively straightforward to address with instrumental variables. The most obvious candidate
instrument for $\widetilde{IQ}_{v,t-1(i)}$ is an alternative measure of inequality that is robust to outliers and measurement
error. We use the 90-10 ratio for this purpose, jackknifed in the same way as other village-level variables:
$\tilde{y}^{90}_{v,t-1(i)} / \tilde{y}^{10}_{v,t-1(i)}$.
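
The following sketch illustrates only why a percentile ratio is less sensitive than the MLD to a single badly measured income; it does not implement the instrumental-variables estimation itself. The incomes are hypothetical, the "mis-recorded" observation is invented, and the decile cut points come from Python's `statistics.quantiles` with the inclusive interpolation method, which is one of several possible percentile conventions.

```python
import math
import statistics

def mld(incomes):
    """Mean log deviation: log of mean income minus mean of log income."""
    mean_income = sum(incomes) / len(incomes)
    mean_log = sum(math.log(y) for y in incomes) / len(incomes)
    return math.log(mean_income) - mean_log

def ratio_90_10(incomes):
    """Ratio of the 90th to the 10th percentile of income."""
    # With n=10, statistics.quantiles returns the nine decile cut points.
    deciles = statistics.quantiles(incomes, n=10, method="inclusive")
    return deciles[8] / deciles[0]

# Twenty hypothetical incomes, evenly spaced from 500 to 1450.
clean = [float(y) for y in range(500, 1500, 50)]
# Same village, but the top income is mis-recorded as ten times too large.
noisy = clean[:-1] + [clean[-1] * 10]

print(f"MLD:         clean = {mld(clean):.4f}, noisy = {mld(noisy):.4f}")
print(f"90-10 ratio: clean = {ratio_90_10(clean):.4f}, noisy = {ratio_90_10(noisy):.4f}")
```

In this example the outlier inflates the MLD several-fold while the 90-10 ratio does not move, since the corrupted observation lies above the 90th percentile cut point.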

Attrition. Our best estimates of village-level initial conditions are based on the largest sample of 
households surveyed in 1987, which because of attrition is significantly larger than the balanced panel of 
households observed continuously between 1987 and 2002. The attrition rate is about 30 percent over the 
15 years. While this hurts the sample size, our greater concern is that household attrition may be 
correlated with both initial inequality and subsequent growth. For example, there may be selective out- 
migration, with households more likely to leave slow-growing villages. The key question is whether such 
migration is correlated with initial inequality. If out-migration was more common in the high-inequality, 
low-growth potential villages, then depending on which households leave, we might observe a spurious 
link between initial inequality and the growth rates calculated on the basis of the initial sample collected 
in 1987, and the revised (attrition-affected) sample in 2002. In our empirical work, we explore the issue of 
attrition in detail, and ultimately present our main results adopting the "Inverse Probability Weighting" 
procedure recommended by Wooldridge (2002). 
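
The logic of inverse probability weighting can be sketched in a deliberately simplified form. The version below estimates the probability of remaining in the panel as the retention rate within a discrete cell of initial characteristics, whereas applied work would typically fit a binary-response model of retention on initial-period covariates. The cell labels, the household records, and the function name `ipw_weights` are all hypothetical.

```python
from collections import defaultdict

def ipw_weights(households):
    """Return one weight per household: 1 / (retention rate of its cell) for
    households still in the panel, and 0.0 for households that attrited."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_initial, n_stayed]
    for h in households:
        counts[h["cell"]][0] += 1
        counts[h["cell"]][1] += int(h["stayed"])
    weights = []
    for h in households:
        n_initial, n_stayed = counts[h["cell"]]
        # A stayer stands in for n_initial / n_stayed initial households.
        weights.append(n_initial / n_stayed if h["stayed"] else 0.0)
    return weights

# Hypothetical 1987 sample: attrition is heavier in the high-inequality cell.
sample = (
    [{"cell": "high_inequality", "stayed": s} for s in (True, True, False, False)]
    + [{"cell": "low_inequality", "stayed": s} for s in (True, True, True, False)]
)
weights = ipw_weights(sample)
print(weights)
```

Stayers from cells with heavier attrition receive larger weights, so the reweighted balanced panel reproduces the composition of the initial sample along the dimensions used to form the cells. The correction is only as good as the assumption that attrition is unrelated to growth once those initial characteristics are held fixed.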

Omitted Variables Bias. Our greatest concern is that even measured perfectly, $\widetilde{IQ}_{v,t-1(i)}$ may be
correlated with a village-level unobservable component of $u_{ivT}$, which we denote as $\nu_{v,t-1}$. No matter how
many covariates we include to control for initial conditions, there would always be the suspicion that
inclusion of a better proxy for $\nu_{v,t-1}$ could eliminate the apparent impact of inequality (e.g., see Kanbur
(2005) for a summary). Perhaps the initial income distribution is related to policies in place at t - 1 that
affect future growth? What characteristic of $\nu_{v,t-1}$ might be of particular concern? Suppose that $\nu_{v,t-1}$ is a
long-standing village taste, or predisposition, for low inequality. This is of concern only if $\nu_{v,t-1}$ is also
related to growth, for example, if egalitarian villages are more likely to invest in growth-enhancing public
goods like schools, or to keep growth-distorting taxation to a minimum. From an interpretation
standpoint, if $\nu_{v,t-1}$ drives the inequality-growth relationship, then our conclusions will only be accurate
from an "historical" descriptive perspective: unequal villages in our sample grew more slowly. If it is the
underlying taste for low inequality ($\nu_{v,t-1}$) that drives growth, then a Robin Hood "intervention" would not
affect the growth trajectory: Controlling for $\nu_{v,t-1}$, the "treatment" of lower inequality would have no
impact on growth. Rising inequality, similarly, would be of no consequence to growth. So while we could
confidently conclude that low inequality villages historically grew faster, if unobserved "egalitarianism" 
was the driving force, we could not draw conclusions concerning the present-day increase in rural 
inequality. Ideally, we need to disentangle the impact of actual inequality from unobserved "tastes" for 
low inequality. 

One way to accomplish this is to find instruments that help predict initial inequality, but are 
independent of $\psi_{vt-1}$. In Benjamin, Brandt, and Giles (2006), we attempted such a strategy, using as 
instruments the asset distribution in period t - 1, especially the initial distribution of land. However, it is 
very difficult to plausibly claim that the land distribution is independent of $\psi_{vt-1}$, as one expects that a 
taste for egalitarianism, for example, also spills over to the land distribution, which was under significant 
control of the village government, and thus likely a function of $\psi_{vt-1}$. Indeed, it is difficult to imagine any 
set of instruments succeeding. 15 Moreover, even if we could find correlates of initial inequality that are 
independent of $\psi_{vt-1}$, it is hard to rule out just about any variable as potentially contributing to household 
growth (i.e., being related to household own-resources), and thus violating the exclusion restriction. 

In the end, we do not attempt to find such instruments, but choose instead to live with the 
important qualification that initial income inequality may be correlated with other factors at the village- 
level (or possibly the township or county-level). Indeed, the interpretation of inequality as causally 
determining institutional development, and thus growth, is predicated on such a correlation. While there 
are assumptions under which inequality alone can be interpreted purely causally, at the very least, we can 
determine whether high inequality is a "marker" or predictor of low growth potential. Besides including 
as many covariates as possible, another way to address the unobserved heterogeneity problem is to treat 
$\psi_{vt-1}$ as a fixed effect, using repeated observations of village inequality and growth to implement a 
village-panel based estimation procedure. We conduct this exercise, but note at the outset that it is subject 
to important and well-recognized limitations. 



15 Several papers have tried to follow up on the insights of Engerman and Sokoloff (1997) that the distribution of 
factor endowments, especially land, may drive subsequent institutional development, inequality and growth. See, for 
example, Easterly (2007) and Galor, Moav, and Vollrath (2009). Lundberg and Squire (2003) also use the 
distribution of land as an instrument for inequality in a growth-inequality specification. As noted by several authors, 
however (e.g., Chong and Gradstein (2007a)), inequality (of income and productive assets) and institutions probably 
co-evolve, mutually affecting each other in various ways. It is almost impossible to imagine how one could be taken 
as exogenous relative to the other. 
3.3 A Note on Aggregation 

For comparability with the previous literature, and to serve as a foundation for the village-panel 
estimation, we also show results for the village analog of (1) (setting aside the jackknife dimension): 

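$$g_{vT} = \overline{\ln y}_{vT} - \overline{\ln y}_{vt-1} = \beta_0 + \beta_1 \overline{\ln y}_{vt-1} + \gamma \left( \ln \bar{y}_{vt-1} - \overline{\ln y}_{vt-1} \right) + \phi \bar{E}_{vt-1} + \varepsilon_{vT} \qquad (2)$$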

In Appendix 1 we show in detail how this equation can be aggregated from a household version (like (1)), 
and thus how the coefficients can be compared across household and village-level specifications. A few 
key points are worth highlighting. First, given our chosen functional form, it is easy to aggregate from the 
household-level regression by simple averaging, which is equivalent to using either village means, or 
village dummies, as instruments for the appropriate household specification. The estimated effect of 
inequality will be numerically identical, whether estimated at the household or village level. Second, our 
specification allows us to better highlight the distinction between the village and household-level 
analysis: Important differences in key parameter estimates emerge in the details of empirical 
implementation, especially the loss of household-level controls, and the use of an internally consistent 
sample that properly aggregates (i.e., only the non-attrited, balanced panel). Third, following the insights 
of Devereux (2007), we show that our jackknifed specification (1) can be derived as the reduced form 
growth equation for Deaton's (1985) measurement-error-correcting estimator of the village-level 
specification (2). This provides "bonus" justification for using Equation (1) as our base specification. We 
are thus able to also implement the Deaton (1985) estimator for the village-level Equation (2). 
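The claim that the inequality coefficient is numerically identical at the two levels can be illustrated with a small sketch. It assumes a single village-level regressor and villages weighted by their number of households; the data and function names are hypothetical and serve only to show the averaging argument.

```python
def ols_slope(x, y, w=None):
    """Slope from an (optionally weighted) least-squares fit of y on x with an intercept."""
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx

# Hypothetical data: village -> (initial inequality, household growth rates).
villages = {
    "A": (0.10, [0.05, 0.04, 0.06]),
    "B": (0.20, [0.03, 0.02]),
    "C": (0.30, [0.01, 0.02, 0.00, 0.01]),
}

# Household level: every household carries its village's inequality measure.
hh_x, hh_y = [], []
for iq, growth in villages.values():
    hh_x += [iq] * len(growth)
    hh_y += growth

# Village level: mean growth per village, weighted by village sample size.
v_x = [iq for iq, _ in villages.values()]
v_y = [sum(g) / len(g) for _, g in villages.values()]
v_n = [float(len(g)) for _, g in villages.values()]

household_slope = ols_slope(hh_x, hh_y)
village_slope = ols_slope(v_x, v_y, v_n)
```

The two slopes coincide because within-village deviations of household growth from the village mean are orthogonal to any regressor that is constant within the village.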

4.0 Empirical Results 

4.1 Data 

The data come from annual household surveys conducted by the Survey Department of the 
Research Center on the Rural Economy (RCRE). The survey collected detailed household-level 
information on income and other household characteristics. The starting point is a sample from 1987 of 
4,847 households drawn from 82 continuously observed villages in nine provinces. 16 Originally planned 
as an annual longitudinal survey, by the end of our sample (2002) there was significant attrition of 
households, on the order of 30 percent. 17 We are able to follow a "pure" balanced panel of 3,424 
households every year between 1987 and 2002, excluding 1992 and 1994 when there was no survey. 



16 The complete RCRE survey covers over 22,000 households in 300 villages in 31 provinces and administrative 
regions. RCRE's complete national survey is 31 percent of the annual size of the NBS Rural Household Survey. By 
agreement, we have obtained access to data from 9 provinces (Anhui, Gansu, Guangdong, Henan, Hunan, Jiangsu, 
Jilin, Shanxi and Sichuan), or roughly one third of the RCRE survey. 

17 The RCRE survey continued past 2002; however, there were significant changes in survey design from 2003 that 
introduced serious comparability problems for income. The post-2002 data cannot be pooled with previous surveys. 
A variety of definitions are useful. First, household membership and the corresponding 
sampling frame are defined on the basis of residency and registration. Second, income is calculated as the 
sum of net income (gross revenue less current expenditures) from agriculture, farming sidelines (e.g. 
animal husbandry and livestock), family-run businesses, plus wage income (earned inside and outside the 
village), and transfers. We calculate the value of farm output that is not sold, and thus largely consumed 
by the household, at market prices. We also use household income before taxes, and deflate all nominal 
values into 1986 prices using the NBS rural consumer price index for each province. 

The key outcome in our analysis is "growth" of household per capita income. To mitigate the 
effect of transitory shocks, or measurement error, on income, we construct two-year averages of 
household income for each two-year time period. Our initial period is 1987-88, and all household-level 
variables, and the village statistics calculated from them, are calculated using the average of 1987 and 
1988 outcomes for each household. Subsequent two-year endpoints, "T", include: 1990-91, three years 
after the initial period (1987-88); 1995-96, the next period for which we have adjacent time periods (8 
years out); 1997-98; 1999-2000; and finally 2001-02, fourteen years after the initial period. 
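As an illustration of how such a growth measure can be built, the sketch below averages two adjacent years at each endpoint and then annualizes the log difference. The annualization by the number of years between endpoints is an assumption made for this example, and the income figures are hypothetical.

```python
import math

def two_year_average(income_by_year, first_year):
    """Average of per capita income in first_year and first_year + 1."""
    return 0.5 * (income_by_year[first_year] + income_by_year[first_year + 1])

def annualized_log_growth(income_by_year, start, end):
    """Log growth between two-year averages beginning in `start` and `end`, per year."""
    y0 = two_year_average(income_by_year, start)
    y1 = two_year_average(income_by_year, end)
    return (math.log(y1) - math.log(y0)) / (end - start)

# Hypothetical household, real per capita income in 1986 prices.
incomes = {1987: 400.0, 1988: 440.0, 2001: 600.0, 2002: 680.0}
g = annualized_log_growth(incomes, 1987, 2001)  # 1987-88 to 2001-02, 14 years apart
```

Averaging adjacent years dampens the influence of a single unusually good or bad year at either endpoint before the growth rate is computed.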

In Table 1, we report the mean and quantiles of key variables for the panel households. Several 
things are noteworthy. First, over the entire period per capita income growth averaged 3.0 percent per 
annum, but growth rates measured from 1987-88 differed significantly across our five ending 
dates. This largely reflects cyclical factors: Marked declines in economic activity accompanied the 
post-Tiananmen economic retrenchment and the Asian Financial Crisis. Incomes actually fell between 
1987-88 and 1990-91, but recovered significantly by 1995-96 so that over the period between 1987-88 and 
1995-96, growth averaged 3.6 percent. Growth in the period ending in 1999-2000 was also lower than 
that ending in 1997-1998. Second, there was clear heterogeneity in the success of households, but the 
dispersion in growth rates narrowed steadily over time. At the top end, for example, the 90th percentile 
of growth rates fell from 11.6 percent for the period ending in 1995-96 to "only" 8.5 percent by 2001-02, 
while fewer households experienced negative growth rates. Finally, data in Table 1 also reveal a 
pronounced shift in the structure of incomes. In 1987-88, agriculture accounted for just over half of household 
income (55%), but by 2001-02, this had fallen by more than a third to 36%. Offsetting this decline was 
the growing importance of wage income earned locally and from outside the village, which grew from 
only 22% of income to 34%, and a slight increase in the role of income from family businesses. 

In our estimation of Equation (1) we wish to control for as many household covariates as 
possible, and these are described in Table 1. Of particular interest is household education, in this case the 

18 The RCRE fixed point survey team has followed the definitions and protocols established as standards by the 
NBS Rural Household Survey unit since its inception in 1986. Further details and comparisons of the RCRE data 
source with other data from rural China are found in the main text and extensive appendices of Benjamin, Brandt, 
and Giles (2005). 
years of education for working-age household members. The average in 1987-88 was only 5.56 years of 
schooling, varying from 2.88 to 9.00 years at the 10th and 90th percentiles. There was also variation across 
households in factor endowments of labor and land, as summarized by the dependency ratio and 
cultivated land. Finally, note that most household heads were between 31 and 50 years of age in 1987-88. 
This does not represent the complete age distribution in 1987, but reflects the higher attrition of older 
households by 2002. Those too young to head households in 1987 do not appear in our panel either. 

In the bottom part of Table 1 we report the key village characteristics that we include in our 
estimation. Our preferred estimates of these variables are based on all available households in the 1987-88 
sample, i.e., not just the balanced panel of households. 19 Our objective is to consistently capture local 
conditions at that time, and statistics based only on the panel households will be less reliable and sample 
sizes smaller. 20 Our key regressor is village inequality, and we report three measures in Table 1. First, the 
Gini coefficient for an average village was 0.20 in 1987-88. The mean, however, hides the variation of 
inequality across villages; inequality is as low as 0.14 at the 10th percentile, and more than double at 0.28 
at the 90th percentile. The Mean Log Deviation shows the same pattern, though it is lower in magnitude. 
The 90-10 ratio, which we use as a more robust indicator of inequality, is 2.56 on average. In the low 
inequality villages, the "rich" (90th percentile) earn less than double the income of the "poor" (10th percentile), while 
in the high inequality villages, the rich earn more than triple the incomes of the poor. Additional village 
controls include village averages of household education, the share of village income derived from the 
main sources (agriculture, wages, and family businesses), and two measures of local public finance: 
Village tax revenue and public expenditures per capita. Our final controls are for topography and 
geography, including a full set of province dummies. 
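For readers less familiar with the inequality indices, the following sketch computes the Gini coefficient and the Mean Log Deviation from a list of household incomes. It uses the standard textbook definitions (the Mean Log Deviation as the log of mean income minus the mean of log income); the function names and sample incomes are illustrative only.

```python
import math

def gini(incomes):
    """Gini coefficient from the rank-weighted sum of sorted, positive incomes."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    return sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs)) / (n * total)

def mean_log_deviation(incomes):
    """Mean Log Deviation: ln(mean income) minus mean of ln(income)."""
    n = len(incomes)
    mean = sum(incomes) / n
    return math.log(mean) - sum(math.log(x) for x in incomes) / n

equal_village = [5.0, 5.0, 5.0]
unequal_village = [1.0, 3.0]
```

Both indices are zero when all households have the same income and rise as incomes spread out; the Mean Log Deviation is particularly sensitive to very low incomes because of the logarithm.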

4.2 Regression Results: Main Household Findings 

In Figure 1 we show a village-level scatter plot of the relationship between initial inequality and 
growth for each potential endpoint. With the exception of the lower average growth rates for the period 
ending in 1990-91, the other four periods show similar average annual growth rates to each other. While 
there is some variation in the slope of the fitted regression, there is also a consistently negative estimated 
relationship between inequality and growth, with the strongest relationship in 1997-98. Furthermore, the 
relationship does not appear to be driven by outliers: no single village or cluster of villages exerts undue 



19 In some specifications, however, when we wish to illustrate issues of aggregation, we will use village-means and 
variables that are based entirely on an "internally consistent" set of balanced panel households only. 
20 Sample sizes in the full sample range from 15 to 137 households per village, with an average sample size of 60 
households per village, with most villages having between 30 and 90 households. In terms of underlying population, 
villages range in size from approximately 700 people (10th percentile) to 3,000 people (90th percentile), with an 
average population of 1,500. 
influence on the slope of the regression line. The raw correlations therefore suggest that inequality is 
negatively related to growth. 

In Table 2 we present the estimates of our core household-level regression (Equation 1) for each 
possible endpoint. We divide the analysis into two parts. In columns (1) through (6) we explore a few 
important issues of specification, culminating in our "bottom-line" results in column (5). In the second 
part, columns (7) through (11), we formally demonstrate the link between our household-level results, and 
the conventional village-level results. Column (1) shows the impact of inequality on growth when we use 
the Gini coefficient as a measure of inequality. The estimated effect of inequality is significant and 
negative for all endpoints, declining from -0.257 in 1990-91, to -0.105 in 2001-02. The decline of the 
impact of inequality is consistent across all specifications, and a key result of the paper. In column (2) we 
move to the Mean Log Deviation as our inequality index. The pattern of coefficients (relative magnitude 
and statistical significance) is identical to that for the Gini. To benchmark the magnitude of the effect, a 
coefficient of -0.20 means that moving from a "high" to a "low" inequality village (e.g., a drop in the mean 
log deviation of 0.10) is associated with a 0.02 increase in average annual growth rates. This compares to 
average annual growth rates of 0.035. In column (3) we employ the weights for attrition in order to 
evaluate the impact of this correction. 21 The results change modestly: the estimated effect of inequality is 
generally a bit smaller, to the extent that by 2001-2002, the estimated effect is no longer significant at the 
5% level. 22 Given that the correction for attrition does matter - even if only a bit - we retain this 
throughout the remaining specifications. 
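The benchmark calculation above can be written out explicitly. Taking an illustrative coefficient of -0.20 on the Mean Log Deviation (written here as $\hat{\gamma}$, a label used only for this example) and a 0.10 reduction in the index:

```latex
\Delta g = \hat{\gamma} \times \Delta IQ = (-0.20) \times (-0.10) = 0.02,
\qquad
\frac{0.02}{0.035} \approx 0.57 .
```

That is, the implied difference in annual growth between a "high" and a "low" inequality village is a little over half of the average annual growth rate of 0.035.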

In the next column (4), we address the possibility of measurement error by using the more robust 
90-10 ratio as an instrument for the Mean Log Deviation. The first stage results are highly significant: the 
t-value on the 90-10 ratio is 19.51, so there is no "weak instrument" problem. The strength of the 
instrument is also illustrated in Figure 2, where we plot the Gini and the Mean Log Deviation against the 
90-10 ratio. This figure also illustrates a more important point: all three inequality measures are highly 
correlated with each other. The second-stage results in column (4) show notably higher estimated effects 
of inequality on growth, with a similar temporal pattern as before: steadily declining influence of initial 
inequality, though statistically significant through 2001-02. These results suggest that the Mean Log 
Deviation and the Gini may be noisy measures of income dispersion in the village, and that the more 
robust 90-10 ratio should be preferred (on its own, not merely as an instrument). There is otherwise no 



21 Appendix Table 1 shows the marginal effects from the Probit Model used to calculate inverse probability weights 
based on observable characteristics as suggested in Wooldridge (2002). Note in particular that the probability of 
attrition is not correlated with the initial level of village inequality, which would otherwise be of potential concern. 

22 We formally test whether the inequality coefficients differ significantly across years. Consider the results from 
Specification (3). We reject the equality of the 5 coefficients, with a p-value of 0.02. Comparing the 1995-96 and 
2001-02 coefficients alone, we marginally reject their equality with a p-value of 0.048. 
reason to prefer one inequality measure over another, and so for our main results, we show both the Mean 
Log Deviation and the 90-10 ratio. 
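Why instrumenting one noisy inequality measure with another raises the estimated coefficient can be seen in a small simulation. This is a stylized sketch, not a replication: it assumes classical measurement error, two measures whose errors are independent of each other, a true coefficient of -0.2, and a far larger number of observations than the 82 villages in the sample, so that sampling noise does not obscure the point.

```python
import random

def covariance(a, b):
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def ols_slope(x, y):
    """Slope of y on x: cov(x, y) / var(x)."""
    return covariance(x, y) / covariance(x, x)

def iv_slope(x, y, z):
    """Just-identified IV slope of y on x using instrument z: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)

rng = random.Random(0)  # fixed seed so the illustration is reproducible
n = 20000
true_iq = [rng.gauss(0.0, 1.0) for _ in range(n)]           # unobserved true inequality
measure_a = [t + rng.gauss(0.0, 1.0) for t in true_iq]      # noisy regressor
measure_b = [t + rng.gauss(0.0, 1.0) for t in true_iq]      # second noisy measure (instrument)
growth = [-0.2 * t + rng.gauss(0.0, 0.1) for t in true_iq]  # assumed true effect: -0.2

ols = ols_slope(measure_a, growth)
iv = iv_slope(measure_a, growth, measure_b)
```

With equal signal and noise variances, the least-squares slope is attenuated to roughly half the true coefficient, whereas the instrumental variables slope is close to -0.2.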

Column (5) reports the results comparable to those in column (3), but using the jackknifed Mean 
Log Deviation as the inequality index. 23 Comparing the coefficients in (5) to those in (3), we see that the 
estimated effect of inequality is slightly smaller. The inclusion of household "i" in the village inequality, 
at least in this instance, does not seem to lead to significant bias in the coefficient. We still retain this 
procedure, however, as ex ante there are good reasons to employ the jackknifed statistics. Our estimated 
effect of inequality is statistically significant in all time periods except the most recent, and as in previous 
columns, the effect declines monotonically over time. In column (6) we show estimates comparable to 
(5), but with the jackknifed 90-10 ratio as our inequality measure. The results are similar to those using 
the Mean Log Deviation, though with sharper statistical significance, even including the most recent time 
period, 2001-2002. 

Controlling for a rich array of household and village regressors, the effect of inequality is difficult 
to dismiss: It is robust to controls for sample attrition, small sample bias, and accounting for measurement 
error. While it fades over time, local inequality appears to exert a purely external effect on household 
growth over and above any household-level proxies for growth potential. 

In the remaining columns we show how the household-level results can be directly compared to 
the village-level, following the discussion in Section 3.3 and Appendix 1. The last column, (11), shows 
the village-level results, the final destination of the aggregation exercise: The effects are generally weaker 
than at the household-level, with significant coefficients only for the 1995-96 and 1997-98 endpoints. 
There are a few key steps, however, from the household-level specification in column (2) to the village- 
level in column (11). As it turns out, using village-level instead of household-level measures of growth 
and inequality has nothing to do with the change in results. The first big step is from column (2) to 
column (7), which is estimated at the household-level, but where we drop many of the covariates, notably 
the household-level controls. Compared to column (2), the estimated coefficients are smaller and less 
significant. 24 Next, in column (8), to ensure that the same household observations are included in the 
household and village-level specifications, we re-estimate the household-level regression using village 

23 In Appendix Table 2 we report the coefficients on the other covariates for the specification in column (5). At the 
household-level, we find the most important predictors of household income growth to be age, with younger 
households doing better, and to some extent, education (though only through 1995-96). For 1997-98 and 1999-2000, 
we find that village mean log income is positively related to growth consistent with Ravallion (1998), but otherwise 
its effect also fades over time. Very few other village-level variables matter in terms of statistical significance. In 
that sense, village-level inequality is remarkable as being the most significant village characteristic that matters 
throughout this time period. 

24 Indeed, the differential inclusion of the household-level covariates may also be part of the reason that Ravallion 
(1998) finds more significant results at the household than county-level: Controlling for household characteristics 
affects the precision of estimation, and also addresses possible omitted variables (non-linearities of the effect of 
own-income, etc.). 
statistics calculated only over the balanced panel of households. The estimated effects are even smaller, 
though statistically significant for the 1995-96 and 1997-98 endpoints. It is the combination of these two 
changes in specification (excluding the household-level covariates and using only the balanced panel 
sample to calculate initial village conditions) that drives the drop in the estimated effect of inequality from 
the household to the village-level regressions: It has nothing to do with aggregation from one level to the 
other per se. 

This is seen most clearly in column (9), where we estimate the household-level Equation (A1.4) 
by instrumental variables, using as instruments the village mean values of Mean Log Income, the Mean 
Log Deviation, and Mean Education. 25 This exercise yields the same results as column (8). We thus 
obtain the same coefficient estimates three ways: Estimation at the household level with either the village- 
level regressors (column 8) or village-means as instruments (column 9); and the village-level results in 
column (11). Finally, in column (10) we implement the Deaton (and Devereux) corrections for small 
sample bias. This entails only a small variation on the specification in column (9): We use the jackknifed 
village means instead of the straight village means as instruments for the household variables in Equation 
A1.4. For this specification, the estimated effects of inequality are slightly smaller, and significant only 
for the "peak" year of 1995-96. Because it accounts for the small-sample measurement error bias, this is 
the preferred "bottom-line" village-level result: inequality has a negative but generally insignificant effect 
that fades quickly after 1995-96. However, our overall preferred specification is column (5), which uses 
the same jackknifed inequality measure, but with the inclusion of the full suite of household covariates, 
and with village statistics calculated using all available observations from 1987-88. As explained in 
Appendix 1, column (5) is a richer "reduced form" for the Deaton and Devereux specification presented 
in column (10). 

4.3 Regression Results: Heterogeneous Responses 

Who is hurt most by high inequality? In Table 3 we present estimates of two variants of Equation 
(1), adding interaction effects between inequality and key dimensions of household covariates: education 
and cohort in one variant, and own-income in the other. For each time period, in column (1), we estimate 
interaction effects for education and cohort. For the cohort interaction, while we maintain the more 
flexible base effects for household age, we define the interaction in terms of "young" and "old", with age 
40 as the cut-off. In column (2) we report interactions between inequality and a household's position in 
the income distribution. We assign households to their quartile within their village, where the quartiles 

25 Identically, we can use Village Dummies as instruments, creating a "Visual Instrumental Variables" estimate 
(intimated by Figure 1). 

26 As usual, the standard errors are slightly higher at the village level, as the Cluster Correction is not a perfect 
substitute for estimating at the village-level. 
(like the inequality index) are calculated by a jackknife estimator (i.e., quartiles are calculated excluding 
household i). The base regression specification for the exploration of interactions is column (5) from 
Table 2. 

Turning first to the education and cohort interactions, they are jointly and individually 
insignificant for 1995-96, and also for 2001-02, when the overall effect of inequality is itself insignificant. 
However, in the middle two time periods of our 1990s sample (1997-98 and 1999-00), we see that 
education significantly offsets the adverse effect of inequality: inequality hurts less educated 
households most. The timing coincides with the Asian financial crisis, which suggests that the least 
educated were especially exposed during those periods, possibly due to reduced migration opportunities. The cohort 
interaction is also significant for 1999-00, and borderline in 1997-98. This interaction coefficient suggests 
that the effect of inequality was worse for the young, which seems somewhat surprising. However, it is 
the young that normally have the highest income growth potential and are the most mobile, and the 
adverse impact of inequality seems to have hurt these households most. In other words, income growth is 
potentially higher for younger households, but those in more unequal villages experience stunted growth. 

In column (2) for each time period, we report the interaction effects for income. The most striking 
finding is that the interaction effects for income quartile are individually and jointly insignificant for all 
time periods. The adverse effects of inequality hurt rich and poor equally. In terms of implications about 
the mechanism, we view this as speaking most clearly to credit market explanations: the effect of 
inequality exists even when controlling for a rich set of household characteristics, and furthermore, affects 
rich as much as poor households within a village. This is not what we would expect to observe if 
imperfect credit markets drove the relationship. Instead, the effect of inequality spills over to households 
across the income distribution, though with less impact on households with higher education. The only 
nuance of the income interactions worth noting is that in 2001-02, where we estimated an insignificant 
impact overall, we do estimate significant negative effects of inequality on the bottom two quartiles. 
While not significantly different from the effect on their richer neighbors, to the extent that there is an 
impact of inequality in the most recent time period, it does seem to fall on the poorest households. The 
effect of inequality does not completely go away for those quartiles. 

4.4 Regression Results: Composition of Income 

We may be able to learn more about the nature of the effect of inequality by investigating links 
between inequality and other measures of economic development. One potentially informative dimension 
is the composition of household income, which will reflect changes in the economic structure of villages, 
and households' ability to participate in more lucrative economic opportunities, especially in non- 
agricultural employment. To conduct this exercise, we make a slight modification to the base 
specification (Equation (2)), replacing the dependent variable by the share of income earned by a 
household in a given activity, A: $S_{AivT}$. The activities are "Agriculture", "Wage Income," and "Family 

Businesses." 27 The omitted category is "Other" income. We also add as controls the household's share of 
income from each activity in the initial time period. In this way, we are estimating the effect of inequality 
on the change in share of income in each activity, controlling for initial household participation in the 
portfolio of activities. Did high inequality distort household evolution or movement into various income 
generating activities? The regression specification is therefore: 

$$S_{AivT} = \delta_0 + \sum_{A=1}^{3} \gamma_A S_{Aivt-1} + \beta' X_{ivt-1} + \delta_1 \ln y_{ivt-1} + \delta_2 \ln \bar{y}_{vt-1} + \delta_v IQ_{vt-1} + u_{AivT} \qquad (3)$$

Again, our base specification is the same as column (5) of Table 2. Core results are shown in columns (1) 
to (3) in Table 4, for the two most recent time periods, 1999-00 and 2001-02 to keep the dimension of 
discussion manageable. We also introduce interactions between initial inequality and education and 
cohort following the discussion of Table 3, presented in columns (4) through (6). 
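The construction of the share-based dependent variable can be sketched as follows. The activity labels follow the text, while the function names and the household's income figures are hypothetical (chosen so that the shares echo the sample averages reported in Table 1).

```python
def income_shares(income_by_activity):
    """Share of total household income from each activity; zero-income activities get share 0."""
    total = sum(income_by_activity.values())
    if total <= 0:
        raise ValueError("total household income must be positive")
    return {activity: value / total for activity, value in income_by_activity.items()}

def share_changes(initial, final):
    """Change in each activity's income share between the initial and final period."""
    s0, s1 = income_shares(initial), income_shares(final)
    return {activity: s1[activity] - s0[activity] for activity in s0}

# Hypothetical household in the initial and final periods.
initial = {"agriculture": 550.0, "wage": 220.0, "family_business": 80.0, "other": 150.0}
final = {"agriculture": 540.0, "wage": 510.0, "family_business": 150.0, "other": 300.0}
changes = share_changes(initial, final)

# A household with no wage income still has a well-defined wage share of zero.
no_wage = income_shares({"agriculture": 900.0, "wage": 0.0, "family_business": 100.0, "other": 0.0})
```

Because shares are defined for every household with positive total income, households that do not participate in an activity remain in the estimation sample with a share of zero.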

For neither period do we find any effect of inequality on the composition of income. The signs 
are consistent with inequality reducing participation in wage-earning activities, and increasing the share 
of income in agriculture, but the estimates are imprecise. The interaction results are more interesting. For 
both periods we see that inequality significantly affects the ability of lower education households to move 
into wage-earning activities inside or outside the village, tilting them instead towards the less lucrative 
agricultural sector. This does not "explain" the growth results, but it does help with some of the 
accounting: the adverse impact of inequality appears to operate by limiting access of households to 
higher-income off-farm employment opportunities. Stated differently, in more equal villages, everyone is 
able to participate in off-farm employment if they otherwise are able to do so. In the unequal villages, 
those households with low education are "trapped" in farming, and experienced lower income growth, 
especially through the 1990s when crop prices were low. 28 By 2001-02, as crop prices started to recover, 
such households were still disproportionately involved in agriculture, but their overall incomes were less 
adversely affected by their having lived in high inequality villages in 1987-88. 



27 Using a share-based specification allows us to estimate this equation at the household-level, even when 
households have "zero" income from a particular source. The overall response will capture the combined movement 
from zero to positive (i.e., participation), as well as the level of income from a given activity. 

28 This is also consistent with the prediction of Chong and Gradstein (2007b) that households in high inequality 
settings may operate disproportionately in the "informal sector" to avoid predation by the higher income households 
through local institutions. We doubt that this specific mechanism operated in Chinese villages, and suspect that the 
education interactions instead point to better educational opportunities, or higher returns to education, in lower 
inequality villages. 
4.5 The Potential Role of Village Heterogeneity

We find robust evidence that initial village inequality has a long-lasting, but fading, effect on 
household growth rates, but as we noted in Section 3.2, this is insufficient evidence to establish a causal 
link between inequality and growth. While we can rule out a large set of mechanical and other 
endogeneity explanations, our most serious reservation is that inequality reflects other initial conditions 
that are correlated with both inequality and growth, which we denoted $W_{vt-1}$. The conventional approach
for dealing with that sort of unobserved heterogeneity is to control for village fixed effects through a first-differenced
implementation of the growth equation. If we can track repeated episodes of growth, i.e.,
first-difference growth rates, then this exercise is potentially informative. As is well known, however, 
there are significant limitations to examining growth rates over short periods in first-differenced models 
because the covariation of high-frequency measures of growth and inequality after controlling for village- 
level unobservables may not be sufficient to identify the relationship between inequality and growth. 
Inequality is a slowly-evolving variable, and even with national-level samples, period-to-period changes 
may be driven largely by measurement error. In the context of rural China over this time period, 
measurement error is an obvious concern. We are also concerned about simultaneity bias, as
growth itself may change the income distribution (a "Kuznets Process"). 30 Moreover, some of the core 
implicit assumptions about the error term necessary for the panel-based approach to work may be violated 
in our data. Nevertheless, the exercise is worth conducting, as it helps to reveal important aspects of the 
inequality-growth relationship. 

Consider the base village-level equation: 

$$g_{vt} = \alpha_0 + \alpha_1 \ln y_{vt-1} + \alpha_v IQ_{vt-1} + \theta_v + \varepsilon_{vt} \tag{4}$$
where growth is defined as that between period $t-1$ and $t$, $\ln y_{vt} - \ln y_{vt-1}$. Note that in keeping with the
focus on household growth, we examine the average household growth rate, not the growth in average
incomes. Our empirical objective is to determine whether our estimate of $\alpha_v$ is robust to controlling for

village "fixed effects." Implementation requires addressing two immediate concerns: constructing a 
village-panel data set, and using an estimator that accounts for the potentially dynamic error structure 
implicit in Equation (4). To construct the village-panel we use the balanced panel of households to 
construct the growth-rate and mean log income variables for each period, symmetric with the household 
regressions, and the larger non-attritted samples to construct the key village covariates like village 

29 See, for example, Forbes (2000) and Banerjee and Duflo (2003) for a discussion of the merits and pitfalls 
associated with the panel-data ("fixed effects") approach in cross-country data. 

30 See Kuznets (1955). Ravallion and Chen (2007) explore precisely this question as they investigate linkages 
between growth rates and various measures of poverty and inequality using nationally representative NBS data. 
Panel analyses using state-level data from the US also find evidence of a Kuznets process leading to a positive
association between inequality and growth (Frank, 2009). 
inequality. We have seven available two-year periods: 1987-88, 1989-90, 1991-93, 1995-96, 1997-98,
1999-00, and 2001-02. 31 In all specifications we include the core set of village covariates (controls for 
initial mean log income, education, land, dependency ratio, composition of village income, geography, 
and province dummies), but we also add period dummies. We address the dynamic panel data issues 
through the use of a variety of estimators, and discuss details in Appendix 2. 

The key results are presented in Table 5. We vary the estimation in two main dimensions: First, 
by estimator, and second by the length of the difference. Longer differences may be more robust to high- 
frequency variation of inequality driven by noise, and may better capture the "long run" effects of 
inequality on growth. 32 We also show results for both measures of inequality used in the cross-section 
analysis: the Mean Log Deviation, and the 90-10 ratio. In the first column we report the pooled OLS 
estimates of Equation (4), with no account for fixed effects but with standard errors clustered at the 
village-level. The estimated effect of inequality is negative, but very small, and statistically insignificant. 
Growth from one period to the next is unrelated to previous inequality when we pool growth and 
inequality from the entire period. In the second column, we take first-differences to sweep out the fixed 
effects. Whether we take short (one period) or long (up to four period) differences, we find no significant
effect of inequality on growth. If anything, the effect is positive, though insignificant, as one might expect
if growth caused inequality. Accounting for dynamic panel data issues does nothing to change the 
conclusion: when we difference to eliminate the village fixed effects, there is no relationship between 
inequality and growth. 
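
The logic of differencing can be illustrated with a minimal sketch. It is not the estimation code behind Table 5: the shock values, the two fixed effects, and the helper names `first_difference` and `growth_path` are all hypothetical. The sketch only shows that a time-invariant village component drops out once a series is differenced, so that seven periods yield six differenced observations that no longer depend on the fixed effect.

```python
def first_difference(series):
    """Return period-to-period changes x_t - x_{t-1} of a series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def growth_path(theta, shocks):
    """Hypothetical village series: a fixed effect theta plus period shocks."""
    return [theta + shock for shock in shocks]


# Seven made-up period shocks, one per two-year period (illustrative only).
shocks = [0.010, -0.004, 0.007, 0.001, -0.006, 0.003, 0.002]

# Two hypothetical villages that differ only in their fixed effect.
fd_low = first_difference(growth_path(0.02, shocks))
fd_high = first_difference(growth_path(0.05, shocks))

# After differencing, the two villages are indistinguishable: the fixed
# effect has been swept out (up to floating-point rounding).
max_gap = max(abs(a - b) for a, b in zip(fd_low, fd_high))
```

The same cancellation is what removes any fixed village characteristic from the differenced growth equation, which is also why only within-village variation in inequality remains to identify the coefficient.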

On the surface, this suggests that unobserved village-level heterogeneity may be driving our 
previous results. However, the evidence is more complicated. As a starting point, note that the inequality- 
growth relationship vanishes before we control for village fixed-effects in the pooled OLS specifications 
of Table 5. Admitting evidence of the impact of inequality from periods after 1987-88 seems to diminish 
the estimated effect of inequality. Alternatively, it could be that short-run estimates of growth are less 
affected by inequality. That is unlikely to be true, however, as we saw in Table 2 that the effect of 
inequality is strongest early in the growth trajectory. To address this question we break apart the village- 
panel into a series of pair-wise cross-sections, and estimate the village-level cross-section specification 
with varying beginning and endpoints. The results are quite striking and reported in Appendix Table 3. 
We find a significant negative relationship between growth and inequality in 1987-88 and every endpoint, 
except 2001-02 (as in Table 2), though even for 2001-02 the result is significant for the 90-10 ratio. 
However, when using pairs of endpoints with the starting period after 1987-88, we do not find a 

31 Note that there was no survey in 1992 or 1994, so the period 1991-93 is constructed slightly differently from all 
of the other periods with adjacent years (1991 and 1993 being non-adjacent). 

32 Note that with the longer differences, given that the IV -based procedures are based on lagged observations, it is 
not possible to properly implement the FD-IV or WG estimators given the short duration of our series. 
relationship between inequality and subsequent growth. Specifically, by 1989-90, village inequality loses
its predictive power for subsequent growth. This means that to the extent that village inequality matters, it 
was inequality, or factors correlated with inequality, at the outset of reforms that affected subsequent 
growth. While robust to a battery of specifications, the 1987-88 results do not hold for subsequent sets of 
initial conditions. Because we do not have data prior to 1987-88, we cannot evaluate whether there is 
something specific to this period alone, or whether it represents the end of the early or pre-reform period. 
What we can conclude, however, is that whatever predictive content is contained in 1987-88 village 
inequality, it is gone for subsequent periods. 

To some extent, this should not be surprising: there is no reason to believe that the determinants 
of inequality in 1987-88 were the same as in subsequent periods. The sources of income variation across 
households evolved with economic and market development, as well as being driven by economic shocks. 
Indeed, we can see this in the evolution of village-level inequality over the time-period. In Figure 3 we 
illustrate this by plotting the evolution of village-level inequality as a function of initial inequality, with 
the 45-degree line (no change in inequality) as reference. A few points are worth noting. First, village 
inequality is generally, though not universally, rising, as most villages have inequality above the 45- 
degree line. Second, there is general convergence of the levels of inequality across villages: increases of 
inequality were systematically highest for those villages with the lowest levels of inequality. The 
"experiment" that generated differences of inequality across villages in 1987-88 is not likely the same as 
that which generated changes of inequality between 1987-88 and 2001-02. The panel data exercise is thus 
likely doomed from the outset: It is predicated on variation in the explanatory variables being comparable 
between and within villages. Combined with results from Table 2 that showed that the effect of inequality 
on growth was not constant in the first place, the village-panel probably cannot be used to address the 
question of whether the growth-inequality relationship is driven by fixed heterogeneity. However, the 
exercise is still revealing as to the limited nature by which increases in inequality potentially mattered 
over this time period: After its initial impact from 1987-88, there is no evidence of a link between 
subsequent inequality and growth. 

5.0 Conclusions and Interpretation 

If in 1987 a compulsive gambler wagered that between two otherwise identical Chinese villages, 
the low inequality village would be richer in 1997 than the high inequality village, he would likely win.
Our estimates suggest that if the difference in the Mean Log Deviation was 0.09, i.e., the difference 
between the 10th and 90th percentiles of inequality in 1987, the average annual growth rate for 
households in the low inequality village would be 1.8 percentage points higher relative to a median 
household annual growth rate of 3.4 percent. By this standard, high inequality was a robust and 
economically significant predictor of slower growth. If he made the same bet for growth fifteen years out
(to 2002), it would be a closer call, though a better than fair bet. Beyond that, however, if he was greedy 
enough to ride the inequality horse in other dimensions, placing money on rising inequality as a marker of 
lower growth, it would be no better than a crap shoot. The preponderance of evidence we find suggests 
that inequality did not have a reliable causal impact on growth. We do not believe that higher inequality 
impeded growth in rural China. 
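
As a back-of-the-envelope reading of the magnitudes quoted in this paragraph, the stated gap in the Mean Log Deviation and the stated growth differential imply a coefficient of roughly 0.2 in absolute value on initial inequality. This is simple arithmetic on the numbers in the text, not a separately reported estimate:

```latex
\[
|\hat{\alpha}| \approx \frac{0.018}{0.09} = 0.20,
\qquad
\frac{1.8 \text{ percentage points}}{3.4 \text{ percent}} \approx 0.53 .
\]
```

In other words, the implied growth gap between the two villages is about half of the median household's annual growth rate.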

This conclusion rests on several strands of evidence. First, by using household-level data we are 
able to rule out imperfect credit markets as the source of the causal relationship between observed 
inequality and growth. Instead, the results using household-level data point to institutional features of 
villages at the outset of reforms as underlying the correlation between inequality and growth. Important 
though those institutions may have been, whatever they were, there is little to suggest that they were a 
causal consequence of inequality. This is most clearly seen when we exploit time-series variation of 
inequality within-villages (in the panel), and all links between growth and inequality disappear. Even if 
we keep the cross-section approach, but use village inequality from the 1990s as the basis of predicting 
growth, we find that inequality ceases to matter: inequality in the 1990s is different from inequality earlier 
in the reform period. Even focusing on 1987 inequality, the source of a safe bet for growth to 1997, we 
find that the effect of inequality fades by 2002. 

We still learn something about the inequality-related determinants of growth in rural China. For 
example, our evidence points to the likelihood of growth-impeding policies that were associated with
higher inequality. The policies seemed to trap their victims, especially the lower-educated in agriculture
(a relative dead end during much of this period), and impeded their movement into more lucrative labor
markets. We cannot be sure exactly what these institutions might be, but likely candidates include those
affecting household returns to running small businesses, the flexibility accorded to households in meeting
local grain quotas (i.e., the degree to which these commitments could be made with cash instead of grain),
and the costs of getting local government permission to migrate to take advantage of newly emerging job
opportunities (de Brauw and Giles, 2008). Whatever the policies or institutions were, their effects eroded
very quickly in the reform period. Our data do not allow us to determine whether high inequality caused 
such policies, or reflected a more general level of dysfunction at the local level as villages embarked on 
the reform process. 

More generally, and setting aside the profound problem of causality, our results show the value of 
using household-level data to address the question of whether inequality has a purely external impact on 
growth. Indeed, there is no better way to distill the external effect from that due to aggregation. 
Furthermore, the household-level data can be used to explore the heterogeneity of responses that may be 
informative about the underlying economic mechanisms. Such evidence is especially useful in ruling out 
factor-market explanations, and pointing by implication to an institutional class of explanations. Our
excellent cross-village data, with all the benefits of clean and comparable measurement of inequality and 
growth over time, however, also show the limits of how much can be ultimately learned in a growth- 
inequality regression. Unless a researcher is willing to believe that both the underlying relationship 
between growth and inequality is stable, and all variation in inequality is driven by the same "treatment," 
then little that is definitive can be learned from this exercise. While it is difficult to believe that either 
condition holds in the Chinese context, it is even harder to believe that such conditions hold in a cross- 
country setting. 
TABLE 1

Sample Summary Statistics

                                                                       Percentiles
                                                   Mean     10th     50th     90th

I. Household-level Data (Panel households; N = 3,424)

Per Capita Income (Constant 1986 Yuan):
  1987-88                                           526      229      448      896
  1990-91                                           518      234      425      870
  1995-96                                           709      316      593    1,192
  1997-98                                           742      330      630    1,223
  1999-00                                           749      310      614    1,312
  2001-02                                           826      334      678    1,443

Annualized Growth Rates from 1987-88 to:
  1990-91                                        -0.009   -0.178   -0.009    0.157
  1995-96                                         0.036   -0.045    0.039    0.116
  1997-98                                         0.034   -0.034    0.034    0.102
  1999-00                                         0.027   -0.035    0.029    0.089
  2001-02                                         0.030   -0.026    0.031    0.085

Composition of Income (Shares):
  Agriculture, 1987-88                             0.55     0.17     0.56     0.89
  Wages, 1987-88                                   0.22     0.00     0.13     0.61
  Business, 1987-88                                0.14     0.00     0.04     0.45
  Agriculture, 2001-2002                           0.36     0.03     0.31     0.79
  Wages, 2001-2002                                 0.34     0.00     0.31     0.79
  Business, 2001-2002                              0.16     0.00     0.00     0.59

Other Household Characteristics in 1987-88:
  Household Education                              5.56     2.88     6.00     9.00
  Household Size                                   4.77     3.00     5.00     7.00
  Dependency Ratio                                 0.43     0.20     0.50     0.60
  Cultivated Land (Mu)                             1.46     0.50     1.17     2.81
  Head Age <= 30                                   0.09
  Head Age between 31 and 40                       0.33
  Head Age between 41 and 50                       0.35
  Head Age between 51 and 60                       0.16
  Head Age 61 and over                             0.06

II. Village-level Characteristics in 1987-88 (N = 82)

Inequality of Per Capita Household Income:
  Gini Coefficient                                 0.20     0.14     0.19     0.28
  Mean Log Deviation                               0.07     0.03     0.06     0.12
  90-10 Ratio                                      2.56     1.91     2.46     3.39

Other Characteristics:
  Household Education                              5.53     4.00     5.66     7.13
  Share of Income from Agriculture                 0.54     0.25     0.59     0.74
  Share of Income from Wages                       0.20     0.07     0.16     0.37
  Share of Income from Family Enterprises          0.15     0.05     0.13     0.30
  Village Total Tax Revenue Per Capita             0.46     0.06     0.31     0.92
  Village Total Public Expenditure Per Capita      0.46     0.06     0.30     0.90
  Mountainous Terrain                              0.26
  Hilly Terrain                                    0.39
  Near a City                                      0.06

Notes: 

1) Household-level statistics are calculated over the 3,424 panel households, while the village-level 
statistics are calculated over all available households in the 82 villages in 1987-88 (4,847) 
TABLE 3

Exploring Interaction Effects: The Effect of Inequality on Growth by Education, Cohort, and Initial Income

(Cluster-corrected Standard Errors in Parentheses)

                                  1995-96             1997-98             1999-00             2001-02
                                 (1)       (2)       (1)       (2)       (1)       (2)       (1)       (2)

Base Effects:
  Inequality (MlnD)          -0.726*    -0.257   -0.344*    -0.188   -0.268*    -0.131    -0.142    -0.074
                             (0.309)   (0.181)   (0.092)   (0.104)   (0.078)   (0.086)   (0.082)   (0.078)
  HH Education 8788            0.000    0.004*   -0.002*     0.000    -0.001     0.001     0.000     0.000
                             (0.003)   (0.001)   (0.001)   (0.000)   (0.001)   (0.000)   (0.001)   (0.000)
  Age <= 30                    0.001     0.002    -0.001     0.003     0.004     0.009    -0.001     0.009
                             (0.013)   (0.011)   (0.005)   (0.005)   (0.004)   (0.005)   (0.004)   (0.006)
  31 <= Age <= 40             -0.021   -0.024*     0.000   -0.007*     0.005   -0.005*     0.000   -0.004*
                             (0.011)   (0.005)   (0.004)   (0.002)   (0.003)   (0.002)   (0.004)   (0.002)
  51 <= Age <= 60              0.002    -0.196   -0.009*    -0.054   -0.007*    -0.075   -0.010*    -0.079
                             (0.007)   (0.153)   (0.002)   (0.084)   (0.002)   (0.070)   (0.002)   (0.069)
  Age > 60                   -0.034*   -0.035*   -0.018*   -0.018*   -0.017*   -0.017*   -0.019*   -0.019*
                             (0.009)   (0.009)   (0.004)   (0.004)   (0.003)   (0.003)   (0.003)   (0.003)
  Income Quartile (Q1)                   0.002               0.000               0.005               0.005
                                       (0.008)             (0.006)             (0.005)             (0.006)
  Income Quartile (Q2)                   0.001               0.003               0.009               0.009
                                       (0.007)             (0.005)             (0.005)             (0.006)
  Income Quartile (Q3)                   0.004              -0.001               0.002               0.004
                                       (0.006)             (0.005)             (0.005)             (0.005)

Interactions:
  Inequality * Education       0.064              0.031*              0.024*               0.008
                             (0.038)             (0.010)             (0.009)             (0.009)
  Inequality * Under 40       -0.046              -0.108             -0.134*              -0.052
                             (0.159)             (0.056)             (0.039)             (0.055)
  Inequality * Q1                       -0.067              -0.054              -0.075              -0.079
                                       (0.099)             (0.084)             (0.070)             (0.069)
  Inequality * Q2                       -0.015              -0.034              -0.103              -0.084
                                       (0.099)             (0.073)             (0.069)             (0.087)
  Inequality * Q3                       -0.032               0.028              -0.003              -0.006
                                       (0.085)             (0.069)             (0.067)             (0.067)

Combined Effect by Quartile:
  Combined Q1                          -0.281*             -0.242*             -0.206*             -0.154*
                                       (0.091)             (0.094)             (0.078)             (0.074)
  Combined Q2                          -0.230*             -0.221*             -0.234*             -0.158*
                                       (0.102)             (0.090)             (0.077)             (0.080)
  Combined Q3                          -0.246*              -0.160              -0.134              -0.081
                                       (0.088)             (0.086)             (0.075)             (0.077)

F-Interactions                  1.23      0.33     6.48*      0.77     8.21*      2.06      1.51      1.59
                            (0.2980)  (0.8005)  (0.0024)  (0.5162)  (0.0006)  (0.1121)  (0.2264)  (0.1982)


Notes: 

1) Each specification is based on the household-level specification with full household and village covariates and Jackknifed
Inequality (Column (5) of Table 2).

2) For each endpoint, we estimate the household specification with separate sets of interactions for Education and Cohort, and
Income Quartile. The Cohort Interaction is based on an indicator of whether the household head was 40 years or younger in
1987-88.

3) "F-Interactions" is the F-statistic for the null hypothesis that the interaction effects are jointly zero.

4) "Combined Q1" (etc.) are the total effects of inequality on income growth for households in a specified 1987-88 Income
Quartile.
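
As a reading aid, the "Combined" rows appear to be the sum of the base inequality coefficient and the relevant quartile interaction in column (2). The check below uses the first-quartile entries reported in the table for the 1997-98 and 1999-00 endpoints; it is an illustration of how to read the table, not an additional estimate:

```latex
\begin{align*}
\text{Combined Q1 (1997-98)} &= -0.188 + (-0.054) = -0.242, \\
\text{Combined Q1 (1999-00)} &= -0.131 + (-0.075) = -0.206 .
\end{align*}
```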
FIGURE 1

The Growth-Inequality Relationship: Various Endpoints

[Figure: village-level scatter plots with fitted values for each endpoint (1990-91, 1995-96, 1997-98, 1999-00, 2001-02). Horizontal axis: Village Inequality in 1987-88 (Mean Log Deviation).]



Notes: 

1) The fitted values are based on simple regressions of growth from period 1987-88 through period T, on 
initial inequality in 1987-88. The regressions are weighted by the number of households in the village 
sample. The number of villages is 82. 

2) The regression coefficients (and t-values) are for 1990-91: -0.084 (0.46); for 1995-96: -0.157 (1.87);
for 1997-98: -0.215 (2.94); for 1999-00: -0.133 (2.04); for 2001-02: -0.130 (2.30). 



FIGURE 2 

The Relationship Between Inequality and the 90-10 Ratio, 1987-88 




Notes: 

1) Each scatter plot is a graph of village inequality (for the Mean Log Deviation, or the Gini Coefficient) 
versus the village 90-10 ratio in 1987-88. The relationship represented in these plots underlies the 
"reduced form" or "first stage" for the correction of potential measurement error of inequality, using 
the 90-10 ratio as an instrument.

2) For the Mean Log Deviation, the t-value of the coefficient on the 90-10 ratio is 24.68 (F = 609.26),
while the similar t-value for the Gini coefficient regression on the 90-10 ratio is 20.38 (F = 415.43). As an
aside, the t-value of a regression of the Mean Log Deviation on the Gini Coefficient is 46.70 (F =
2180.66).



FIGURE 3 

The Relationship Between Inequality in 1987-88 and Subsequent Inequality 




Notes: 

1) This graph plots village-level inequality in subsequent time periods against initial inequality in 1987- 
88. 

2) We show the 45-degree line along which villages would be if inequality were unchanged over time. 

3) We also show the fitted values of a regression of inequality in the final period (2001-02) as a function 
of inequality in 1987-88. 



Appendix 1: Relating the Household and Village-level Specifications 



As our estimation is conducted at the household-level, we avoid most issues of aggregation. However, it 
is still worth relating our approach (Equation (1)) to the more conventional village-level specification, 
especially since all of the key variation (of inequality) resides at the village-level. To begin with, consider 
a simplified village-level specification relating growth to inequality (the Mean Log Deviation): 

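$$g_{vT} = \overline{\ln y}_{vT} - \overline{\ln y}_{vt-1} = \beta_0 + \beta_1 \overline{\ln y}_{vt-1} + \beta_v \left( \ln \bar{y}_{vt-1} - \overline{\ln y}_{vt-1} \right) + \varepsilon_{vT} \tag{A1.1}$$

Here $\overline{\ln y}_{vt}$ is the village mean of log household income and $\ln \bar{y}_{vt}$ is the log of mean village income, so the term in parentheses is the Mean Log Deviation.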

Note that this is a departure from the previous literature as we define growth as the difference in the mean 
log incomes (i.e., average household growth rate), as opposed to the change in log mean incomes. This 
allows cleaner aggregation from the household-level. There are several ways to obtain (Al.l) by 
averaging a household specification. 

Consider first a household-level model that allows for an external effect of mean village (log) 
income, as well as own income, plus village income inequality: 

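$$g_{ivT} = \ln y_{ivT} - \ln y_{ivt-1} = \alpha_0 + \alpha_1 \ln y_{ivt-1} + \alpha_2 \overline{\ln y}_{vt-1} + \alpha_v \left( \ln \bar{y}_{vt-1} - \overline{\ln y}_{vt-1} \right) + \varepsilon_{ivT} \tag{A1.2}$$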

This averages by village to: 

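$$g_{vT} = \alpha_0 + (\alpha_1 + \alpha_2) \overline{\ln y}_{vt-1} + \alpha_v \left( \ln \bar{y}_{vt-1} - \overline{\ln y}_{vt-1} \right) + \bar{\varepsilon}_{vT} \tag{A1.3}$$

Notice that the coefficient on $\overline{\ln y}_{vt-1}$ captures the combined effect of own (log) household income, and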

mean (log) village income. The effects are not separately identified at the village-level. This is related to 
the main point raised in Ravallion (1998), though the confounding of the own-income effect with 
inequality is not an issue with our specification, as we use "mean logs" as opposed to "log means" as our 
key covariates. 

A second way to obtain a version of (A1.1) is to specify the household-level equation as:

$$g_{ivT} = \alpha_0 + \alpha_1 \ln y_{ivt-1} + \alpha_2 \overline{\ln y}_{vt-1} + \alpha_v \underbrace{\left( \ln \bar{y}_{vt-1} - \ln y_{ivt-1} \right)}_{DEV_{ivt-1}} + \varepsilon_{ivT} \tag{A1.4}$$

The key difference between (A1.4) and (A1.2) is that the inequality measure is
$DEV_{ivt-1} = \ln \bar{y}_{vt-1} - \ln y_{ivt-1}$, the amount by which household i's log income is less than the log of the mean

income in the village (controlling for both the independent effect of own log income, and the mean log 
income in the village). This equation also aggregates to the same village-level regression (Al.l). At the 
village-level, we cannot tell whether it is MLD or DEV that affects average household income growth. 
From our perspective, the distinction is not important, as there is no "right" measure of inequality. 
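
The aggregation claim above can be checked numerically with a small sketch. It assumes the standard definition of the Mean Log Deviation (log of mean income minus mean of log income); the incomes and the function names `mean_log_deviation` and `household_dev` are hypothetical and chosen only for illustration. Averaging the household-level DEV terms within a village reproduces the village MLD, which is why the two household specifications collapse to the same village-level regression.

```python
import math


def mean_log_deviation(incomes):
    """MLD = ln(mean income) - mean(ln income) for one village."""
    mean_income = sum(incomes) / len(incomes)
    mean_log = sum(math.log(y) for y in incomes) / len(incomes)
    return math.log(mean_income) - mean_log


def household_dev(incomes):
    """DEV_i = ln(mean village income) - ln(own income) for each household."""
    log_mean = math.log(sum(incomes) / len(incomes))
    return [log_mean - math.log(y) for y in incomes]


# Hypothetical per capita incomes for one village (illustrative values only).
village = [229.0, 448.0, 526.0, 896.0]

mld = mean_log_deviation(village)
avg_dev = sum(household_dev(village)) / len(village)

# A perfectly equal village has zero inequality under this measure.
equal_village_mld = mean_log_deviation([500.0, 500.0, 500.0, 500.0])
```

Because `avg_dev` and `mld` coincide, a village-level regression cannot distinguish an effect of MLD from the average effect of DEV, as noted in the text.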



Inequality and Growth in Rural China, Appendices: Page 1 



Mechanically, the village-level equation (A1.1) can be estimated using the household-level data
and specification (A1.4) by 2SLS using as instruments either a vector of village dummies, or identically a
vector of village means, $\overline{\ln y}_{vt-1}$ and $MLD_{vt-1}$ (see, for example, Angrist and Pischke (2009) and "Visual
Instrumental Variables"). As long as the weights are correct (i.e., use the number of households in a
village as weights in the village-level specification), then the household (2SLS) and village-level (WLS)
specifications will yield identical coefficients for $\alpha_v$, $\beta_v$. This exercise is conducted in Column (9) of Table
2, which is compared to Column (8) of Table 2.
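
The equivalence between household-level 2SLS with village-dummy instruments and village-level WLS can be sketched in a single-regressor setting. The data and the helper `wls_slope` below are hypothetical, and the example abstracts from the covariates in the actual specifications. With village dummies as instruments, the first-stage fitted value for each household is simply its village mean of the regressor, so the second stage regresses household outcomes on village means.

```python
from collections import defaultdict


def wls_slope(x, y, weights=None):
    """Slope from a (weighted) least-squares regression of y on x with intercept."""
    if weights is None:
        weights = [1.0] * len(x)
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    cov = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    var = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    return cov / var


# Hypothetical households: (village id, regressor, growth outcome).
households = [
    ("a", 1.0, 0.05), ("a", 2.0, 0.03), ("a", 3.0, 0.04),
    ("b", 2.0, 0.02), ("b", 4.0, 0.01),
    ("c", 5.0, -0.01), ("c", 6.0, 0.00), ("c", 7.0, -0.02), ("c", 6.0, 0.01),
]

by_village = defaultdict(list)
for village, x, y in households:
    by_village[village].append((x, y))

# Village means of x and y, plus village sample size.
village_stats = {
    v: (sum(x for x, _ in obs) / len(obs),
        sum(y for _, y in obs) / len(obs),
        float(len(obs)))
    for v, obs in by_village.items()
}

# Household-level 2SLS: the first stage on village dummies returns village means.
x_fitted = [village_stats[v][0] for v, _, _ in households]
y_household = [y for _, _, y in households]
slope_2sls = wls_slope(x_fitted, y_household)

# Village-level WLS on means, weighted by the number of households.
x_means = [s[0] for s in village_stats.values()]
y_means = [s[1] for s in village_stats.values()]
sizes = [s[2] for s in village_stats.values()]
slope_wls = wls_slope(x_means, y_means, sizes)
```

The two slopes agree up to rounding only because the village regression is weighted by village size; an unweighted regression on village means would generally give a different answer.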

While it is conventional to estimate the village-level (or country-level) regression using village
sample averages and measures of inequality (e.g., $\overline{\ln y}_{vt-1}$, $MLD_{vt-1}$), there are small-sample and
measurement error issues with using sample averages as "proxies" for population moments (Deaton,
1985). Devereux (2007) shows, however, that the Deaton measurement- error estimator is identical to a 
Jack-Knife Instrumental Variable (JIVE) estimator, where Jack-knifed sample means serve as instruments 
for individual-level variables. The intent in that exercise, it should be noted, is NOT to estimate or 
identify the individual-level specification, but to correct for measurement error in the village-level 
(aggregate) specification. Adapting Devereux's framework, the implied "structural" equation at the 
household-level is: 

$$g_{ivT} = \alpha_0 + (\alpha_1 + \alpha_2) \ln y_{ivt-1} + \alpha_3 \underbrace{\left( \ln \bar{y}_{vt-1} - \ln y_{ivt-1} \right)}_{DEV_{ivt-1}} + \varepsilon_{ivT} \tag{A1.5}$$

Correcting for measurement error (implementing Deaton's estimator using Devereux's Jack-knife result)
simply involves using $\overline{\ln y}_{vt-1(i)}$ and $MLD_{vt-1(i)}$ as instruments for $\ln y_{ivt-1}$ and $DEV_{ivt-1}$. The implied
reduced form equation for $g_{ivT}$ in this approach is:

$$g_{ivT} = \pi_0 + \pi_1 \overline{\ln y}_{vt-1(i)} + \pi_2 MLD_{vt-1(i)} + v_{ivT} \tag{A1.6}$$

Except for our inclusion of covariates (which can be easily incorporated into Devereux's framework), this 
reduced form (A1.6) is identical to our structural equation (1). In other words, our main estimating 
equation can be seen as "highly similar" to the implied reduced-form associated with Deaton's 
measurement error correcting group-means estimator. 1 It is not necessary that our model be imbedded 



1 There is one other bit of slippage: in this discussion we treat $\ln \bar{y}_{vt-1}$ as a known parameter. Of course, it is not.
However, because the sample mean is inside the non-linear logarithm function, the linear algebra associated with the
JIVE estimator does not hold exactly (without treating it as a constant). In our empirical work, however, we use the
Jack-knifed version of $\bar{y}_{vt-1(i)}$ inside the logarithm to address "in spirit" the same small-sample (measurement error)
problem. Note that this only affects the degree to which our equation (1) is an exact reduced form for the
Deaton/Devereux specification.
within this framework, as Equation (1) is a bona fide structural model in its own right. However, equation
(A1.6) allows us to better illustrate the advantages of using the household-level data to correct for a 
variety of potential econometric problems associated with using the aggregated data, and to precisely 
highlight the role of aggregation. 
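
The jackknifed village statistics referred to in this appendix can be sketched as leave-one-out calculations: for each household, the village mean of log income and the village MLD are recomputed excluding that household's own income. The function names and incomes below are hypothetical, and the sketch omits any sampling weights that the actual construction may use.

```python
import math


def mean_log_deviation(incomes):
    """MLD = ln(mean income) - mean(ln income)."""
    mean_income = sum(incomes) / len(incomes)
    mean_log = sum(math.log(y) for y in incomes) / len(incomes)
    return math.log(mean_income) - mean_log


def leave_one_out_stats(incomes):
    """For each household i, return (mean log income, MLD) computed over the
    other households in the village, excluding household i itself."""
    stats = []
    for i in range(len(incomes)):
        others = incomes[:i] + incomes[i + 1:]
        mean_log = sum(math.log(y) for y in others) / len(others)
        stats.append((mean_log, mean_log_deviation(others)))
    return stats


# Hypothetical village incomes (illustrative values only).
village = [229.0, 448.0, 526.0, 896.0]
loo = leave_one_out_stats(village)

# Tiny two-household case: leaving out the first household leaves a single
# income of 400, so the remaining "village" has zero inequality.
tiny = leave_one_out_stats([100.0, 400.0])
```

Excluding the household's own income means that its village-level regressors are not mechanically built from its own, possibly mismeasured, initial income.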

Appendix 2: Expanded Discussion of Panel Data Models 

The discussion in this appendix provides further detail on the panel data analysis presented in 
Table 5 and discussed in Section 4.5. As noted in the body of Section 4.5, the analysis summarized in 
Table 5 first examines the relationship described in Equation (4) between growth and lagged inequality 
over short time periods, and then implements first-differenced models that eliminate the effects of fixed 
village unobservables. It then examines whether the results change after controlling for biases potentially introduced if growth
and lagged income are both systematically related to a dynamic unobservable in the error term.

The base OLS specification is presented in column (1) and suggests that when short-term growth 
rates are regressed on lagged inequality, the results are statistically insignificant whether we use the Mean 
Log Deviation or the 90-10 ratio as our inequality measure. Column (2) shows straight first-differenced 
(FD) models that sweep out unobserved fixed effects, employing differences of increasing duration from 
short (one-period) to long (four-period) differences. In these first-differenced models, there is no
significant effect of inequality on growth, and if anything, there is a positive (though insignificant) effect
as one expects with a Kuznets process.

Dynamic issues arise with the concern that change in growth and lag growth (or lag change in
levels) may be systematically related through a common error component. Remember that change
in growth (the dependent variable) is calculated as $(\ln y_{it} - \ln y_{it-1}) - (\ln y_{it-1} - \ln y_{it-2})$, and the change in
lagged levels (a regressor) is $(\ln y_{it-1} - \ln y_{it-2})$. Common error in periods $t-1$ or $t-2$ may lead to a
mechanical negative correlation between the change in growth and the change in the lagged levels. More
importantly, the lag change in the inequality measure (Mean Log Deviation) has components of the same
error term as the mean log level of income. We thus estimate instrumented versions of the FD models
(FD-IV models), in which we employ as instruments period $t-3$ values of the levels of lagged income
and lagged inequality. At least mechanically in this model, period $t-3$ measures of these variables are
not part of either the dependent variable, or the lagged changes of our main regressors. As long as the
$t-3$ levels of these variables significantly predict the regressors (which are changes), and are in turn
uncorrelated with the error term, then these lagged values will be eligible instruments. We implement this
procedure in the third column of each panel, for the first-difference specifications only. Note that while
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imprecisely estimated (and statistically insignificant), the magnitude of the coefficient on the difference in 
lagged inequality has increased by an order of magnitude for the model using mean log deviation, and the 
coefficient on the 90-10 ratio remains positive and insignificant. Results from this exercise should be 
treated with some caution, however, as the instruments, while conventional are relatively weak: The 
Kleibergen-Paap Wald F-statistic is between 2.15 and 2.42 for the mean log deviation and 90-10 models, 
respectively. 
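
The mechanical correlation described above can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that measured log income contains an i.i.d. error e with variance sigma^2 in each period. Under that assumption the error in the change in growth is e_t - 2e_{t-1} + e_{t-2}, the error in the change in lagged levels is e_{t-1} - e_{t-2}, and their covariance is -3 sigma^2, while the error in the t-3 level is uncorrelated with both. The function names and the error process are hypothetical and are not taken from the paper's estimation code.

```python
import random


def sample_cov(x, y):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def simulate_error_covariances(n=50_000, sigma=1.0, seed=0):
    """Simulate i.i.d. errors in log income for periods t, t-1, t-2 and t-3.

    Returns two sample covariances:
      1) error in the change in growth vs. error in the change in lagged levels
      2) error in the change in growth vs. error in the t-3 level (the instrument)
    The i.i.d. error process is an illustrative assumption.
    """
    rng = random.Random(seed)
    dep_err, reg_err, inst_err = [], [], []
    for _ in range(n):
        e_t, e_t1, e_t2, e_t3 = (rng.gauss(0.0, sigma) for _ in range(4))
        dep_err.append((e_t - e_t1) - (e_t1 - e_t2))  # change in growth
        reg_err.append(e_t1 - e_t2)                   # change in lagged levels
        inst_err.append(e_t3)                         # t-3 level
    return sample_cov(dep_err, reg_err), sample_cov(dep_err, inst_err)


cov_regressor, cov_instrument = simulate_error_covariances()
# With sigma = 1 the first value should be close to -3 and the second close to 0.
print(round(cov_regressor, 2), round(cov_instrument, 2))
```

If the errors were serially correlated, the second covariance would not generally be zero, which is one reason the validity of lagged levels as instruments remains an assumption rather than a guarantee.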

Next, we implement the "WG Estimator" as an alternative procedure for estimating our dynamic
panel-data specification. The WG Estimator essentially provides a fixed effects approach to estimating the
dynamic panel model. The key assumption behind this model is that the main source of bias is introduced
through the lagged income growth term. In this model, the focus is on addressing the lagged dependent
variable only, and we treat the lagged inequality measure as exogenous. This reduces the demands on
finding instruments (which are lagged regressors, as before). Again, we observe a positive and
insignificant relationship between one-period lagged inequality and subsequent growth for both measures
of inequality.

Finally, we compare the performance of the WG estimator to an FD-IV model that also treats
inequality as exogenous (as opposed to endogenous, as in the FD-IV specification). We denote this
"WG-Comp" and show the results in the fifth column for each measure of inequality. The WG-Comp estimator
yields a similar estimate to the straight WG estimator: there is no relationship between lagged inequality
and growth, once we account for error structures that include a "fixed effect."

APPENDIX TABLE 1
Attrition Equations
Probit Estimates: Probability of Being in the Panel Estimating Sample
(Marginal Effects, Standard Errors in Parentheses)

                                                        In Panel Sample
                                                         (1)         (2)

Initial Village Inequality (Mean Log Deviation)         -0.516      -0.920
                                                        (1.698)     (1.726)

Household Variables (in initial period, 1987-88):
  Log Initial Income                                    18.614      -0.195
                                                        (5.633)     (0.063)
  Log Initial Income - Squared                          -2.873
                                                        (0.908)
  Log Initial Income - Cubed                             0.144
                                                        (0.049)
  Share of Income from Agriculture in 1987-88            0.180       0.197
                                                        (0.217)     (0.217)
  Share of Income from Wages in 1987-88                  0.082       0.093
                                                        (0.192)     (0.196)
  Share of Income from Family Businesses in 1987-88      0.010      -0.012
                                                        (0.185)     (0.184)
  Household Education in 1987-88                        -0.004      -0.003
                                                        (0.010)     (0.010)
  Log Household Size                                     0.087       0.097
                                                        (0.074)     (0.073)
  Dependency Ratio                                       0.183       0.190
                                                        (0.143)     (0.142)
  Log Cultivated Land                                    0.009       0.010
                                                        (0.023)     (0.024)
  Head Age <= 30                                        -0.089      -0.094
                                                        (0.078)     (0.078)
  Head Age between 31 and 40                             0.030       0.029
                                                        (0.052)     (0.052)
  Head Age between 51 and 60                            -0.184      -0.180
                                                        (0.065)     (0.064)
  Head Age 61 and over                                  -0.275      -0.257
                                                        (0.098)     (0.097)

Village Variables (in initial period, 1987-88):
  Mean Log Per Capita Income                            -0.095      -0.074
                                                        (0.256)     (0.259)
  Average Education                                      0.049       0.054
                                                        (0.065)     (0.065)
  Cultivated Land Per Capita                            -0.115      -0.093
                                                        (0.153)     (0.155)
  Village Dependency Ratio                               0.193       0.369
                                                        (1.598)     (1.583)
  Village Share of Income from Agriculture              -0.652      -0.563
                                                        (1.492)     (1.458)
  Village Share of Income from Wages                    -0.650      -0.504
                                                        (1.522)     (1.500)
  Village Share of Income from Family Businesses        -0.677      -0.462
                                                        (1.700)     (1.642)
  Village Tax Revenue Per Capita                         0.954       0.847
                                                        (0.427)     (0.432)
  Village Government Expenditure Per Capita             -0.846      -0.769
                                                        (0.405)     (0.411)

Sample Size (1987-88)                                    4847        4847

Notes:

1) Dependent variable is the probability of being in the primary estimating sample (observed in all years from
1987 through 2002, with complete variable set)

2) All specifications include village location and province dummies

3) Statistically significant coefficients highlighted (at the 5% level)

4) Specification (1), with the cubic in household income, is used to generate inverse probability weights for
the attrition corrections throughout the paper.
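
Note 4 refers to inverse probability weights built from the attrition probit. A minimal sketch of that weighting step follows, assuming the fitted probabilities of remaining in the panel are already available from a model such as specification (1). The function names and the trimming bound `floor` are hypothetical choices made for illustration; the paper does not state whether weights were trimmed.

```python
def inverse_probability_weights(p_in_panel, floor=0.05):
    """Return 1 / Pr(in panel) for each retained household.

    p_in_panel: fitted probabilities from an attrition probit (assumed given).
    floor: hypothetical lower bound on probabilities to avoid extreme weights.
    """
    return [1.0 / max(p, floor) for p in p_in_panel]


def weighted_mean(values, weights):
    """Attrition-weighted mean of an outcome among retained households."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Two retained households: the second was less likely to stay in the panel,
# so it stands in for more of the households that dropped out.
weights = inverse_probability_weights([0.5, 0.25])
growth = [1.0, 4.0]
print(weights)                         # [2.0, 4.0]
print(weighted_mean(growth, weights))  # 3.0
```

Households that resemble those who left the sample receive larger weights, so weighted estimates lean toward what would have been observed had no one dropped out, provided attrition depends only on the observed covariates.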



APPENDIX TABLE 2
Additional Covariates (Corresponding to Table 2, Column 5)
(Cluster-corrected Standard Errors in Parentheses)

Endpoint:                                1990-91   1995-96   1997-98   1999-00   2001-02

Initial Inequality (MlnD, Jackknifed)   -0.354*   -0.240*   -0.197*   -0.173*   -0.111
                                        (0.173)   (0.084)   (0.085)   (0.071)   (0.068)

Household Variables:
  Ln Initial Income (Y0)                -0.321    -0.133    -0.249    -0.024    -0.252
                                        (0.938)   (0.234)   (0.278)   (0.210)   (0.185)
  (Ln Y0)-Squared                        0.014     0.009     0.028    -0.009     0.027
                                        (0.149)   (0.039)   (0.046)   (0.034)   (0.030)
  (Ln Y0)-Cubed                          0.000    -0.001    -0.002     0.001    -0.001
                                        (0.008)   (0.002)   (0.003)   (0.002)   (0.002)
  Household Education, 1987-88           0.004*    0.002*    0.000     0.001     0.000
                                        (0.001)   (0.000)   (0.000)   (0.000)   (0.000)
  Log Family Size                        0.013     0.011*    0.008*    0.005     0.000
                                        (0.009)   (0.003)   (0.003)   (0.002)   (0.002)
  Dependency Ratio                      -0.041*    0.013     0.024*    0.029*    0.030*
                                        (0.014)   (0.007)   (0.006)   (0.005)   (0.005)
  Log Cultivated Land                    0.000     0.001     0.000    -0.002*    0.001
                                        (0.004)   (0.001)   (0.001)   (0.001)   (0.001)
  Age <= 30                             -0.002    -0.011*   -0.008*   -0.006    -0.005
                                        (0.009)   (0.004)   (0.003)   (0.003)   (0.003)
  31 <= Age <= 40                       -0.024*   -0.011*   -0.007*   -0.004*   -0.004*
                                        (0.005)   (0.003)   (0.002)   (0.002)   (0.002)
  51 <= Age <= 60                        0.002    -0.006    -0.009*   -0.007*   -0.010*
                                        (0.007)   (0.003)   (0.002)   (0.002)   (0.002)
  Age > 60                              -0.035*   -0.027*   -0.018*   -0.016*   -0.019*
                                        (0.009)   (0.005)   (0.004)   (0.003)   (0.003)

Village Variables:
  Log Mean Initial Income                0.023     0.026     0.034*    0.028*    0.017
                                        (0.028)   (0.014)   (0.014)   (0.011)   (0.009)
  Avg Education                         -0.001     0.001     0.002     0.003     0.003
                                        (0.006)   (0.003)   (0.003)   (0.002)   (0.002)
  Log Per Capita Land                    0.033*    0.006     0.006     0.003     0.003
                                        (0.015)   (0.008)   (0.008)   (0.007)   (0.006)
  Dependency Ratio                      -0.336*   -0.109    -0.080    -0.120*   -0.116*
                                        (0.144)   (0.066)   (0.057)   (0.044)   (0.037)
  Share of HH Income from Agriculture   -0.063     0.035     0.015     0.058     0.045
                                        (0.143)   (0.052)   (0.053)   (0.049)   (0.042)
  Share of HH Income from Wages          0.047     0.068     0.019     0.057     0.024
                                        (0.160)   (0.054)   (0.053)   (0.050)   (0.046)
  Share of HH Income from Fam Business  -0.004     0.059     0.020     0.093     0.045
                                        (0.150)   (0.067)   (0.067)   (0.057)   (0.051)
  Tax Revenue Per Capita                -0.028    -0.001    -0.011    -0.005    -0.006
                                        (0.056)   (0.022)   (0.020)   (0.015)   (0.015)
  Gov Expenditure Per Capita             0.050     0.003     0.011     0.010     0.015
                                        (0.061)   (0.021)   (0.020)   (0.016)   (0.017)

N                                        3424      3424      3424      3424      3424

Notes:

1) All specifications employ attrition weights.

2) Village mean log income and education are jackknifed.

3) For further details, see the notes to Table 2.



APPENDIX TABLE 3
How does initial inequality relate to various subsequent growth rates?
Varying the Beginning and End Periods, Village-Level "Cross-Section" Specification

Measure of Inequality: Mean Log Deviation

Endpoint Period:
             (2)       (3)       (4)       (5)       (6)       (7)
             1989-90   1991-93   1995-96   1997-98   1999-00   2001-02
Beginning Period:
1987-88     -0.722*   -0.221    -0.255*   -0.234*   -0.190*   -0.132
            (0.257)   (0.137)   (0.084)   (0.089)   (0.078)   (0.080)
1989-90               -0.153    -0.090    -0.116     0.018     0.025
                      (0.214)   (0.128)   (0.092)   (0.077)   (0.063)
1991-93                          0.126     0.099     0.108     0.128
                                (0.288)   (0.201)   (0.122)   (0.097)
1995-96                                    0.190    -0.002     0.105
                                          (0.239)   (0.192)   (0.128)
1997-98                                             -0.077     0.085
                                                    (0.319)   (0.185)
1999-00                                                        0.285
                                                              (0.184)

Measure of Inequality: 90-10 Ratio

Endpoint Period:
             (2)       (3)       (4)       (5)       (6)       (7)
             1989-90   1991-93   1995-96   1997-98   1999-00   2001-02
Beginning Period:
1987-88     -0.047*   -0.015*   -0.015*   -0.015*   -0.012*   -0.009*
            (0.014)   (0.007)   (0.004)   (0.004)   (0.004)   (0.004)
1989-90               -0.004    -0.001    -0.005     0.001     0.000
                      (0.008)   (0.006)   (0.004)   (0.004)   (0.003)
1991-93                          0.004     0.001     0.005     0.006
                                (0.012)   (0.008)   (0.006)   (0.005)
1995-96                                    0.009     0.000     0.006
                                          (0.011)   (0.008)   (0.006)
1997-98                                             -0.009     0.001
                                                    (0.018)   (0.010)
1999-00                                                        0.008
                                                              (0.008)

Notes:

1) All specifications use the village-panel data set, cross-section specification

2) The reported numbers are the coefficients (and standard errors) of the effect of inequality on growth from a
regression of growth between period "t" and "t-1" as a function of village characteristics in period "t-1"



